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Abstract. Several construction methods for rational approximations to functions 
of one real variable are described in the present paper; the computational results that 
characterize the comparative accuracy of these methods are presented; an effect of 
error autocorrection is considered. This effect occurs in efficient methods of ratio- 
nal approximation (e.g., Pade approximations, linear and nonlinear Pade-Chebyshev 
approximations) where very significant errors in the coefficients do not affect the 
accuracy of the approximation. The matter of import is that the errors in the nu- 
merator and the denominator of a fractional rational approximant compensate each 
other. This effect is related to the fact that the errors in the coefficients of a rational 
approximant are not distributed in an arbitrary way but form the coefficients of a new 
approximant to the approximated function. Understanding of the error autocorrec- 
tion mechanism allows to decrease this error by varying the approximation procedure 
depending on the form of the approximant. Some applications arc described in the 
paper. In particular, a method of implementation of basic calculations on decimal 
computers that uses the technique of rational approximations is described in the 
Appendix. 

To a considerable extent the paper is a survey and the exposition is as elementary 
as possible. 



Contents 

§1. Introduction 2 

§2. Best approximants 4 

§3. Construction methods for best approximants 6 
§4. The role of approximate methods and an estimate of the 

quality of approximation 8 
§5. Chebyshev polynomials and polynomial approximations 10 
§6. Ill-conditioned problems and rational approximations 13 
3 7. The effect of error autocorrection 15 
§8. Pade approximations 17 
§9. Linear Pade-Chebyshev approximations and the PADE pro- 
gram 19 
§10. The PADE program. Analysis of the algorithm 22 
§11. The "cross-multiplied" linear Pade-Chebyshev approxima- 
tion scheme 26 
§12. Nonlinear Pade-Chebyshev approximations 27 



Published in: Russian Journal of Mathematical Physics, vol.1, No. 3, 1994. 



1 



Typeset by ^5-TeX 



§13. Applications of the computer algebra system REDUCE to 

the construction of rational approximants 28 
§14. The effect of error autocorrection for nonlinear Pade-Cheby- 

shev approximations 29 
§15. Small deformations of approximated functions and accelera- 
tion of convergence of series 32 
§16. Applications to computer calculation 33 
§17. Nonlinear models and rational approximants 34 
APPENDIX. A method of implementation of basic calculations 
on decimal computers 35 

1. Introduction 35 

2. Floating point arithmetic system 36 

3. The design of computation for elementary functions 36 

4. Algorithms 37 

4.1. Calculation of logarithms 37 

4.2. Calculation of exponentials 38 

4.3. Calculation of sinx and cos a; 38 

4.4. Calculation of tgx 39 

4.5. Calculation of arctgx 39 

4.6. Calculation of arcsinx 39 

5. Coefficients 39 

6. Analysis of the algorithms 39 

7. Implementation of algorithms for calculating elementary 
functions 40 

References 43 



Whenever he has some money to spare, he goes to a 
shop and buys some kind of useful book. Once he bought a 
book that was entitled "Inverse trigonometrical functions 
and Chebyshev polynomials" . 

N.N.Nosov "Happy family". Moscow, 1975, p. 91 

§1. Introduction 

The author came across the phenomenon of error autocorrection at the end 
of seventies while developing nonstandard algorithms for computing elementary 
functions on small computers. It was required to construct rational approximants 
of the form 

a + Qix + a 2 x 2 j h a„x" 

[ ' [X) ~ b + fox + b 2 x 2 + ■■■ + b m x m 

to certain functions of one variable x defined on finite segments of the real line. For 
this purpose a simple method (described in [1] and below) was used: the method 
allows to determine the family of coefficients a,, bj of the approximant (1) as the 
solution of a certain system of linear algebraic equations. These systems turned 
out to be ill conditioned, i.e., the problem of determining the coefficients of the 
approximant is, generally speaking, ill-posed in the sense of [2]. Nevertheless, the 
method ensures a paradoxically high quality of the obtained approximants whose 
errors are close to the best possible [1]. 
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For example, for the function cos a; the approximant of the form (1) on the 
segment [— 7r/4, 7r/4] obtained by the method mentioned above for m = 4, n = 6 
has the relative error equal to 0.55 • 1CP 13 , and the best possible relative error 
is 0.46 • 10~ 13 [3]. The corresponding system of linear algebraic equations has 
the condition number of order 10 9 . Thus we risk losing 9 accurate decimal digits 
in the solution of calculation errors. Computer experiments show that this is a 
serious risk. The method mentioned above was implemented as a Fortran program. 
The calculations were carried out with double precision (16 decimal positions) by 
means of ICL-4-50 and ES-1045 computers. These computers are very similar in 
their architecture, but when passing from one computer to another the system of 
linear equations and the computational process are perturbed because of calculation 
errors, including round-off errors. As a result, the coefficients of the approximant 
mentioned above to the function cos x experience a perturbation already at the 
sixth ninth decimal digits. But the error in the rational approximant itself remains 
invariant and is 0.4 • 10~ 13 for the absolute error and 0.55 • 10~ 13 for the relative 
error. The same thing happens for approximants of the form (1) to the function 
arctgx on the segment [-1,1] obtained by the method mentioned above for to = 8, 
n = 9 the relative error is 0.5 • lO^ 11 and does not change while passing from ICL- 
4-50 to ES-1045 although the corresponding system of linear equations has the 
condition number of order 10 , and the coefficients of the approximant experience 
a perturbation with relative error of order 10~ 4 . 

Thus the errors in the numerator and the denominator of a rational approximant 
compensate each other. The effect of error autocorrection is connected with the 
fact that the errors in the coefficients of a rational approximant are not distributed 
in an arbitrary way, but form the coefficients of a new approximant to the approxi- 
mated function. It can be easily understood that the standard methods of interval 
arithmetic (see, for example, [54] ) do not allow to take into account this effect and, 
as a result, to estimate the error in the rational approximant accurately. 

Note that the application of standard procedures known in the theory of ill-posed 
problems results in this case in losses in accuracy. For example, if one applies the 
rcgularization method, two thirds of the accurate figures are lost [4]; in addition, 
the amount of calculations increases rapidly. The matter of import is that the exact 
solution of the system of equations in the present case is not the ultimate goal; the 
aim is to construct an approximant which is precise enough. This approach allows 
to "rehabilitate" (i.e., to justify) and to simplify a number of algorithms intended 
for the construction of the approximant, to obtain (without additional transforms) 
approximants in a form which is convenient for applications. 

Professor Yudell L. Luke kindly drew the author's attention to his papers [5, 6] 
where the effect of error autocorrection for the classical Pade approximants was 
revealed and was explained at a heuristic level. The method mentioned above leads 
to the linear Pade-Chebyshev approximants if the calculation errors are ignored. 

In the present paper, using heuristic arguments and the results of computer 
experiments, the error autocorrection mechanism is considered for quite a general 
situation (linear methods for the construction of rational approximants, nonlinear 
generalized Pade approximations). The efficiency of the construction algorithms 
used for rational approximants seems to be due to the error autocorrection effect 
(at least in the case when the number of coefficients is large enough) . 

Our new understanding of the error autocorrection mechanism allows us, to 
some extent, to control calculation errors by changing the construction procedure 
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depending on the form of the approximant. 

In the paper the construction algorithm for linear Pade-Chebyshev approxi- 
mants is considered and the corresponding program is briefly described (see [7]). It 
is shown that the appearance of a control parameter allowing to take into account 
the error autocorrection mechanism ensures the decrease of the calculation errors 
in some cases. Results of computer calculations that characterize the possibilities 
of the program and the quality of the approximants obtained as compared to the 
best ones are presented. Some other (linear and nonlinear) construction meth- 
ods for rational approximants are described. Construction methods for linear and 
nonlinear Pade-Chebyshev approximants involving the computer algebra system 
REDUCE (see [8]) are also briefly described. Computation results characterizing 
the comparative precision of these methods are given. With regard to the error 
autocorrection phenomenon the effect described in [9] and connected with the fact 
that a small variation of an approximated function can lead to a sharp decrease 
in accuracy of the Pade-Chebyshev approximants is analyzed. Some applications 
are indicated. In particular, a method of implementation of basic calculations on 
decimal computers that uses the technique of rational approximations is described 
in the Appendix. 

To a considerable extent the paper is a survey and the exposition is as elemen- 
tary as possible. In the survey part of the paper we tried to present the required 
information clearly and consistently, to make it self-contained. But this part does 
not claim to be complete: the number of papers concerning rational approximations 
theory and its applications in numerical analysis (including computer calculation of 
functions, numerical solving of equations, acceleration of convergence of series, and 
quadratures), in theoretical and experimental physics (including quantum field the- 
ory, scattering theory, nuclear and neutron physics), in the theory and practice of 
experimental data processing , in mechanics, in control theory, and other branches 
is much too vast; see, in particular, the reviews and reference handbooks [3, 10-16]. 

The author is grateful to Yudell L. Luke for stimulating conversations and valu- 
able instructions. The author also wishes to express his thanks to I. A. Andreeva, 
A. Ya. Rodionov and V. N. Fridman who participated in the programming and 
organization of computer experiments. This paper would not have been written 
without their help. A preliminary version of the paper was published in [56] . 

§2. Best approximants 

We shall need some information and results pertaining to ideas of P. L. Chcby- 
shev, see [17]. Let [A,B] be a real line segment (i.e., [A,B] is the set of all real 
numbers x such that A ^ x ^ B) and f(x) be a continuous function defined on this 
segment. Consider the absolute error function of the approximant of the form (1) 
to the function f(x), i.e., the quantity 

(2) A(x) = f{x) - R(x), 

and the absolute error of this approximant, i.e., the quantity 

A classical problem of approximation theory is to determine, for fixed degrees m 
and n in (1), the coefficients in the numerator and the denominator of expression (1) 
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(3) 



so that (3) is the smallest possible. The corresponding approximant is called best 
(with respect to the absolute error). An important role is played by the following 
result. 

Generalized de la Vallee Poussin theorem [17]. If the polynomials 

P{x) = a + d\x H h d n ^ l ,x n ~ v , 

(4) _ _ 

Q(x) = b + hx + --- + b m -^x m -v, 

where O^^i^m, O^f^n, & r „_ M ^ 0, have no common divisor (i.e., the fraction 
R(x) — P(x)/Q(x) is irreducible), the expression R{x) = P{x) / Q(x) is finite on the 
segment [A, B], and at successive points x\ < X2 < ■ ■ ■ < x^ of the segment [A, B] 
the error function A(x) = f(x) — R(x) of the approximant R takes nonzero values 
Ai, — A2, • • • , (— 1) N Xn with alternating signs (so that the numbers \ are either all 
positive or all negative ), A = m + n + 2 — d, where d is the smallest of the numbers 
[i, v, then the error A of any approximant of the form (1) satisfies the inequality 

(5) A> A = min{|A 1 |,|A 2 |,...,|A jV |}. 

Proof. Suppose that there exists an approximant R{x) of the form (1) for which 
the inequality (5) is not satisfied. Consider the difference 

e(x) = R(x) - R(x) = [f(x) - R(x)] - [f(x) - R(x)] = A(x) - A(x). 

From our assumption it follows that the numbers s(xi), £(#2), . . . , s(xn) differ from 
zero and have alternate signs. And this, in its turn, by virtue of the continuity of 
the function e(x) on the segment [A, B] implies that the function e(x) has at least 
A — l = m + n+ l — d zeros inside the segment [A, B]. On the other hand, the 
definition of the function e(x) implies the equality e(x) — U(x)/V(x), where U(x) 
and V(x) are polynomials and the degree of U{x) does not exceed m + n — d. So the 
function e(x) cannot have more than A — 2 = m + n — d zeros. This contradiction 
proves the theorem. 

The quantity d which is mentioned in the theorem is called the defect of the 
approximant R(x); in practice usually d = fx = v = 0. The generalized de la 
Vallcc-Poussin theorem gives us a sufficient condition for the approximant R(x) = 
P(x)/Q(x), where P{x) and Q{x) are polynomials of the form (4), to be best. The 
points x\ < x 2 < ■ ■ ■ < xn of the segment [A,B] are called Chebyshev alternation 
points for the approximant R(x) if the error function A(x) = f(x) — R{x) at these 
points has values which coincide with the absolute error A of the approximant R in 
absolute value and are alternate in sign. In other words, at the points x\,...,Xn 
the error function A(x) has extrema with alternating signs which coincide with 
each other in absolute value. From the generalized de la Vallee Poussin theorem, 
it follows that the presence of Chebyshev alternation points is sufficient for the 
approximant R(x) to be best. 

Chebyshev theorem. The presence of Chebyshev alternation points is a neces- 
sary and sufficient condition under which the approximant is best. Such an approx- 
imant exists and is unique if two fractions that coincide after cancellation are not 
regarded as different. 

A comparatively simple proof is given in [17]. Note that P. L. Chebyshev and 
Vallee-Poussin considered the case of polynomial approximants. The general case 
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was first considered by N. I. Akhiezer, the results mentioned being valid also in 
the case when the expression A p (x) = (f(x) — R(x))/p(x), where the weight p is 
nonzero, is taken for the error function; if the weight satisfies certain additional 
conditions, then the segment [A, B] need not be assumed finite [17]. Note, that for 
p{x) e 1 we obtain the absolute error (2); and if fix) has no zeros on the segment 
[A, B], then for p(x) = f(x) we shall obtain the relative error junction 

(6) S(x)= m = m^m =1 R ^ 



m fix) fix) • 

Correspondingly, the quantity 

(7) 5= max \5(x)\ 

is the relative error, and one can speak of the best approximants with respect to the 
relative error. 

Suppose that the segment [A, B] is symmetric with respect to zero, i.e., A = —B. 
If the function f(x) is even, then it is not difficult to verify that all its best rational 
approximants on this segment (in the sense of the absolute error or of the relative 
one) are also even functions, so that one can immediately look for them in the form 
i?(x 2 ) = Pix 2 )/Q{x 2 ), where P and Q are polynomials. If the function f(x) is odd, 
then its best approximants are also odd functions and one can immediately look for 
them in the form xR(x 2 ) = xP(x 2 )/Q(x 2 ), where P and Q are polynomials. One 
can speak of the best approximants with respect to the relative error, if an odd 
function f[x) is zero only for x — 0, is continuously differentiable, and /' (0) ^ 0. 
In this case f(x) can be represented in the form xip{x), where <p(x) is a continuous 
even function that never equals zero. Then describing rational approximants to the 
function f{x) with best relative error reduces to solving the same problem for the 
even function </?(x); indeed, 

f(x) - xR{x 2 ) _ xipix) - xRix 2 ) _ tp(x) - Rix 2 ) 
fix) xtpix) (fix) 

§3. Construction methods for best approximants 

Suppose that a rational approximant of the form (1) is the best approximant to 
a continuous function fix) on the segment [A, B]. For simplicity, further we shall 
assume that the defect is zero. Let x\, X2, ■ ■ ■ , x. m+n+ 2 be the Chebyshev alternation 
points. Then the error function A p (x) corresponding to the weight p (see above) 
satisfies the following system of equalities: 

(8) \ix k ) = (-l) fc A, 

where |A| = A p = max^^s |A p (x)|; k = 1, . . . ,m + n + 2. For fixed values of 
x\, . . . , x m+n+ 2, relations (8) can be regarded as a system of m + n + 2 equations 
with respect to the unknowns dj, bj, A, where i = 0, . . . , n, j = 0, . . . , m. Since one 
can multiply the numerator and the denominator of the fraction R(x) by the same 
number, we see that one more condition, for example, b = 1, can be added to sys- 
tem (8), so that the number of equations coincides with the number of unknowns. 
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The iteration method of computation of coefficients in the approximant R(x) (sug- 
gested by A. Ya. Remez (see [18]) for polynomial approximants and generalized to 
the general case) is based on this idea. Different versions of the generalized Remez 
method were considered in many papers; see, for example [3, 12, 20-27]. 

The approximant is constructed as follows. On the first step, the initial approxi- 
mations x\ < x 2 < ■ ■ ■ < Xm+n+2 to the Chebyshev alternation points are chosen on 
the segment [A, B] and the system of equations (8) is solved. As a result we obtain 
some rational approximant R\{x) with error function Ai(x) = (f(x) — R\{x))/ p{x). 
For this function the extremum points are found, and the information obtained is 
used to modify the set {x\, . . . ,x m+n+ i}. Then the procedure is repeated anew, a 
new approximant R2(x) is obtained, and so on. 

Taking into account the fact that A p (x) = (f(x) — R(x))/p(x) and R(x) has the 
form (1), system (8) can be rewritten in the form 

whence, as the result of elementary transformations, we get the system of equations 

n m 

(9) 5Ia 4 (x fe ) 4 + MA)5>,(x fc ) J =0, 

i=0 j=0 

where p, k {\) = {-l) k p{x k )\ - f(x k ), k = 1, 2, . . . , m + n + 2. 

Note that for a fixed value of A (as well as for the alternation points 
xi, . . . ,x m+n+2 ) the coefficients a^, bj of the approximant satisfy the system of 
linear homogeneous algebraic equations (9). But A must also be determined; this 
transforms (9) into a nonlinear system of equations which is rather difficult to 
solve. The case when it is necessary to find the polynomial approximant, i.e., the 
case m = 0, is an exception to what was just noted. In this case the system (9) 
becomes linear. 

The solution of nonlinear system of equations (9) is usually reduced to the iter- 
ated solution of systems of linear equations. The following method is comparatively 
popular (see, for example, [3, 12, 21, 22, 25, 28]) and was used to compile the well- 
known tables of rational approximants to elementary and special functions [3] . Let 
&o = 1 (normalization); then (9) takes the following form 

n m 

(9') Yl + ^ E + (-l)V(zfc)A = f(x k ). 

i=0 j=l 

Substituting a fixed number Ao for A in the nonlinear terms of system (9'), we get 
the linear system 

n m 

(10) J2ai(x k y + n k (X )J2b j (x k y + {-l) k p(x k )\ = f{x k ). 

i=0 j=l 

The iteration process is applied to the initial collection of values of the critical 
points {x k }, i.e., of initial approximations to the Chebyshev alternation points, 
and to the given value A . First, from (10) one determines the new value of A 
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and substitutes it for A in the nonlinear terms of equation (10); then the system of 
equations (10) is solved again, and the next value of A is determined, and so on. 
As a result a new value of A and the collection of the coefficients a^, bj are defined. 
The next step is to determine a new collection of critical points {xk} as extremum 
points of the error function for the approximant obtained on the previous step. 
Both steps form one cycle of an iteration process. The calculation is finished when 
the value of A with precision given in advance coincides in absolute value with the 
maximal value of the error function. A complete text of the corresponding Algol 
program is given in [22]. 

Unfortunately the iteration process described above can be nonconvergent even 
in the case when the initial approximation differs from the solution of the problem 
infinitcsimally; see [28]. For some versions of the Remez method it is proved that the 
iteration process converges if the initial approximation is sufficiently good, see [20, 
23-25, 29, 12]. Nevertheless, in each particular case it is often difficult to indicate 
a priori (i.e., before the start of calculations) the initial approximant that ensures 
the convergence of the iteration process, and for a given initial approximation it is 
difficult to verify whether the conditions which ensure the convergence are satisfied. 
One of the methods which is applied in practice is to construct, at first, the best 
polynomial approximant of degree m + n (in this case no difficulties arise); next, 
using the Chebyshev alternation points of this approximant as the initial collection 
of critical points for the iteration process one constructs the best approximant 
having the form of a polynomial of degree m + n — 1 divided by a linear function. 
Finally, in the same manner, the degree of the numerator is successively reduced 
and the degree of the denominator is successively raised till an approximant of the 
required form (1) is obtained, see [3, 12]. 

Together with iteration methods for constructing the best rational approximants, 
methods of linear and convex programming are used, see [18, 30]. Iteration methods, 
as a rule, are more efficient [27], but cannot be generalized directly to the case of 
functions of several variables. 

§4. The role of approximate methods and an 
estimate of the quality of approximation 

The construction algorithms for the best rational approximants are compara- 
tively complicated, so simpler methods that give an approximate solution of the 
problem are used on a large scale, see, for example, [1, 5, 7-9, 11-15, 24, 25, 31- 
37]. Below we describe methods which are easily implemented, use comparatively 
little computation time and yield approximants that are close to best. Such an ap- 
proximant can be used as an initial approximant for an iteration algorithm which 
gives the exact result. The approximant that is best in the sense of the absolute 
error is not necessary best in the sense of the relative error. It is usually important 
in practice for both the absolute error and the relative one to be small. So rather 
than the best approximants, the approximants constructed by means of an approx- 
imate method and having appropriate absolute and relative errors are often more 
convenient. Finally, one can also apply methods giving an approximate solution 
of the rational approximation problem to those cases when the information about 
an approximated function is incomplete (for example, there are known values of 
a function only for a finite number of the argument values, or there are known 
only the first terms of the function expansion in a series, or the initial information 
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contains an error, and so on). 

The generalized de la Vallee-Poussin theorem (see §2 above) allows to estimate 
the proximity of an approximate solution of the approximation problem to the best 
approximant even in the case when this best approximant is unknown. 

For example, suppose we want to estimate the proximity of a given approximant 
of the form (1) to an approximant of the same form with best absolute error to a 
given function f(x). Suppose for simplicity that the defect of the best approximant 
is zero (in practice this condition usually holds). Then, by virtue of Chebyshev's 
alternation theorem, in the case when the given approximant R(x) is sufficiently 
close to the best one, at the successive points x\ < ■ ■ ■ < x m+n+ 2 belonging to the 
interval where the argument x varies the absolute error function A (a;) takes the 
nonzero values Ai, — A 2 , . . . , (— l) m+ ™ +1 A m+ „ +2 having alternating signs. In this 
case we shall say that alternation appears. If |Ai| = | A2 1 = • • • = |A m+ „ + 2| 7 then 
this alternation is Chebyshev's. Denote by A m i n the best possible absolute error 
of approximants of the form (1) to the function f(x) (the numbers m and n are 
fixed). Suppose A = min{|Ai|, . . . , |A m +„+2|}- Then, due to the generalized de la 
Vallee-Poussin theorem, the inequality A m ; n ^ A is valid; thus 

(11) A > A min ^ A. 

It is clear that A coincides with the greatest (in absolute value) extremum of 
the function f(x), and one can take the least (in absolute value) extremum of this 
function for A (up to a sign). The quantity 

(12) q = A/A 

characterizes the proximity of the error of the given approximant to the error of the 
best approximant. It is clear that < q ^ 1 and q = 1 if the given approximant is 
best. The closer the quantity q to 1 the higher the approximant quality. From (11) 
and (12) it follows that 

(13) A min ^ q ■ A. 

Usually, the estimate (13) is rather rough. The appearance of the alternation itself 
indicates to the closeness of the error of the given approximant to the best one, and 
the quantity A m ; n /A is, in general, much greater than the value of q. 

Similarly, the quality of an approximant with respect to the best relative error 
is evaluated. 

If we can calculate the values of the approximated function for all the points 
of the segment [A, B] (or for a sufficiently "dense" set of such points), and if the 
coefficients of the rational approximant R(x) are already known, then it is not hard 
to determine the points of local extremum of the error function and to calculate the 
quantities Ai, A2, . . . , A m+n+ 2, and also the quantities A and q by means of a special 
standard subroutine. The same subroutine is also necessary for the construction of 
the best approximants by means of an iteration method. A good program package 
for the construction of rational approximants must contain a subroutine of this sort 
as well as a good subroutine for solving systems of linear algebraic equations and 
must have, as a component part, routines which implement both the algorithms for 
approximate solving the approximation problem and the construction algorithms 
for best approximants. 
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§5. Chebyshev polynomials and polynomial approximations 



Chcbyshev polynomials play an important role in approximation theory and in 
computational practice (see, for example, [12, 13, 17, 18, 24, 33, 38]). We shall 
consider Chebyshev polynomials of the first kind. 

These polynomials were defined by P. L. Chebyshev in the form 

(14) T n {x) = cos(narccosx), 

where n = 0, 1, ... . Assume that ip = arccosx; representing cos nip via sin ip and 
cos<£, it is not difficult to verify that the right-hand side of formula (14) coincides 
indeed with a certain polynomial. In particular, 

T (a;) = cosO = 1, 

T\{x) — cos ip — cos(arccosx) = x, 

T2(x) = cos2ip = cos 2 ip — sin 2 ip 

= T?(x)-(l-T 2 (x))=2x 2 -l 

and so on. For an actual computation of T n (x) the recurrence relation 

(15) T n (x) = 2xT n _ 1 (x) - T„_ 2 (x) 

is usually used. Sometimes it is more convenient to consider the polynomials 
T n (x) = 2~ n+1 T n (x) since the coefficient at x n of the polynomial T n {x) is equal 
to 1. The polynomials mentioned above satisfy the recurrence relation 

(16) T n ( X ) = xTn-^x) - ~Jn-2{x). 

Consider a particular case of the problem of the best approximation, the approxi- 
mant to the function f(x) = x n on the segment [—1, 1] being looked for in the form 
of a polynomial P(x) of degree n — 1. From the de la Vallee-Poussin theorem it 
follows that the approximant in question has the form P{x) = x n — T n (x). In this 
case the error function A(x) coincides with T n (x) and one can explicitly obtain the 
Chebyshev alternation points: Xk = — cos ^L, where fc = 0, 1, . . . , n. Indeed, 

T„(x fe ) = 2- rl+1 cosn(7r-— ) 

n ' 

(_i\n-k 

^2-"+ 1 cos(n-fc) 7 r = i^^, 

i.e., T n (x) takes its maximum value 1/2" -1 with alternate signs at the points 
indicated above. This implies an important consequence: the best polynomial 
approximant of degree n — 1 to the polynomial ao + a\x + ■ ■ ■ + a n x n on the segment 
[—1, 1] has the form a + a\x + • • • + a n x n — a n T n {x). This result allows to reduce 
the degree of a polynomial (for example, of some polynomial approximant) with 
a minimum loss of accuracy. The reduction of the polynomial degree by means 
of successively applying the method indicated above is called economization. The 
economization method is due to C. Lanczos, see [38]. 

10 



. , x m can be expressed via the Chebyshev polynomials 
To, 7i, . . . , T m . For m > the following formula is valid: 

[m/2] . . 

(17) x m = 2 1 -™ £ a fc [ m )T m _ 2k (x), 



k=0 



where (T) are the binomial coefficients, [m/2] is the integer part of the number 
to/2, afe = 1/2 for k = m/2 and aj, = 1 for fc 7^ m/2. The expansion of the 
polynomial T m in powers of x for m > is given by the formula 

Finally, a; = To = 1- It is clear that the set of polynomials of the form X)"=o c «^i 
where Cj arc numerical coefficients, coincides with the set of all polynomials of 
degree n. 

The economization procedure mentioned above can also be described in the 
following way. The initial polynomial Y^i=o a i x% can t> e represented by means of 
formula (17) in the form X)™=o c «^- The polynomial of degree k obtained as the 
result of economization coincides with J2i=o c «^- F° r functions represented in the 
form of power series f(x) = YliLo aiX% ^ * s not difficult to obtain, by means of 
the economization method, polynomial approximants on the segment [—1,1] close 
to the best ones. For this purpose it is necessary to replace f(x) by its truncated 
Taylor series at the point x = 0, i.e., by the polynomial J27=o aiX% approximating 
this function with a high degree of accuracy, and then to obtain, by means of the 
economization of this polynomial, the polynomial Yli=o c i^i °f tne gi yen degree k. 
As n — > 00, the quantity X)i=o c ^ tends to the sum of the first k + 1 terms of the 
expansion of f(x) into Fourier series with respect to Chebyshev polynomials. 

Denote by L? w the Hilbert space of square integrable (with respect to the measure 
w(x)dx) functions on the segment [—1,1]. Suppose w(x) = \J\ — x 2 . It is not 
hard to verify that the Chebyshev polynomials T n form an orthogonal (but not 
orthonormal) basis in L 2 W . The expansion 

00 

(19) f(x) = J2a T i 

of a function fix) into the series in Chebyshev polynomials (the Fourier-Chebyshev 
series) is easily reduced to the expansion of the function /(cos x) into the standard 
Fourier series in cosines. Among the polynomials of degree n, the polynomial 
Pn{x) — 5Z™_o CjTj gives the best approximation to the function f(x) in L 2 W . The 
following result shows that this approximant on the segment [—1,1] is close to the 
best one in the sense of the absolute error. 

Cheney theorem. Let ip(x) be a function integrable on the segment [—1, 1]. If for 
i = 0, 1, 2, . . . , k the equality 

1 

(20) J ip{x)Ti{x)w{x) dx = 

-1 
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is valid, then ip(x) either changes its sign in [— 1, 1] at least k + 1 times or vanishes 
almost everywhere. 



Proof. Assume that ip(x) has exactly to sign changes at the points x\ , . . . , x m , where 
^ to ^ k. Suppose P(x) = II™ 1 (a; — Xi); since P(x) is a polynomial of degree 
to, we see that it can be represented in the form of a linear combination of the 
polynomials T , T\, . . . , T m . Thus (20) implies J j ip(x)P(x)w(x) dx = 0. It is clear 
that the function ip(x)P(x) has no sign changes; thus the equality just obtained 
means that <p(x) vanishes almost everywhere. The latter proves the theorem. 

In a more general case, this result is proved in [39, p. 110]. The proof given 
above allows a generalization to the case of systems of orthogonal polynomials of 
a sufficiently general form and arbitrary segments of integration (including infinite 
ones), see [40]. 

Now let us return to the function (19) and to its approximant P n (x) = J27=o c i^i- 
The absolute error function 

oo 

A(x) = f(x) - P n (x) = ]T cm 

i—n+1 

is orthogonal to the polynomials T , Ti, . . . , T n , i.e., 



J A(x)T 4 (x)w(x)dx = 



for all i — 0, 1, . . . ,n. If the function f(x) is continuous, then the error function 
A(x) is also continuous. In this case from the Cheney theorem it follows that either 
A(x) is identically zero or has n + 1 sign changes. This means that alternation is 
present, i.e., the approximant P n (x) is close to the best one, and their proximity 
can be evaluated by means of relations (11) — (13). 

While the truncated Taylor series Y^i=o ^/^(O) 2 ^ & ves the best approxi- 
mant only in a neighborhood of the origin, the truncated Fourier-Chebyshev series 
Sr=o c iTi for the function f(x) with the same number of terms gives an approxi- 
mant which is close to best one on the entire segment [—1,1]. 

The change of the variable x i— » \[{B — A)x + A + B] reduces the problem of 
approximation on an arbitrary finite segment [A, B] to the case of the segment 
[—1,1]. Further, as a rule, we shall consider the latter case. 

§6. Ill-conditioned problems and rational approximations 

Let {ipo,ipi, . . . ,ip n } and {i/'o, ^lj ■ • ■ 1 4>m} be collections consisting of linearly in- 
dependent functions of the argument x belonging some (possibly multidimensional) 
set X. Consider the problem of constructing an approximant of the form 

, 21 n R , x _ a y + anpi H h a n Vn 

foo^o + Mi + • • • + b m ^ m 

to a given function f{x) defined on X. If X coincides with a real line segment [A, B] , 
ifk = x k and ipk = x h for all fc, then the expression (21) turns out to be a rational 
function of the form (1) (see the Introduction). It is clear that expression (21) also 
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gives a rational function in the case when we take Chebyshev polynomials or, 
for example, Legendre, Laguerre, Hermite, etc. polynomials as ipk and ipk- 

Fix an abstract construction method for an approximant of the form (21) and 
consider the problem of computing the coefficients aj, bj. Quite often this problem 
is ill-conditioned, i.e., small perturbations of the approximated function f(x) or a 
calculation errors lead to considerable errors in the values of coefficients. For exam- 
ple, the problem of computing coefficients for best rational approximants (including 
polynomial approximants) for high degrees of the numerator or the denominator is 
ill-conditioned. 

The instability with respect to the calculation error can be related both to the 
abstract construction method of approximation (i.e., with the formulation of the 
problem) and to the particular algorithm implementing the method. The fact that 
the problem of computing coefficients for the best approximant is ill-conditioned is 
related to the formulation of this problem. This is also valid for other construction 
methods for rational approximants with a sufficiently large number of coefficients. 
But an unfortunate choice of the algorithm implementing a certain method can 
aggravate troubles connected with ill-conditioning. 

Several construction methods for approximants of the form (21) are connected 
with solving systems of linear algebraic equations. This procedure can lead to a 
large error if the corresponding matrix is ill-conditioned. Consider an arbitrary 
system of linear algebraic equations 



(22) Ay = h, 

where A is a given square matrix of order N with components a^- (i, j = 1, . . . , N), 
his a given vector column with components hi, and y is an unknown vector column 
with components yi. Define the vector norm by the equality 



(23) imi=5Zn 

i=l 

(this norm is more convenient for calculations than \J x\ + ■ ■ ■ + x 2 N ). Then the 
matrix norm is determined by the equality 

N 

(24) p|| = max ||Ay|| - ^ax^Ki 

i— 1 

If a matrix A is nonsingular, then the quantity 



(25) c(m&{A) = \\A\\-\\A- 1 \\ 

is called the condition number of the matrix A (sec, for example, [41]). Since 
y = A~ x h, we see that the absolute error Ay of the vector y is connected with the 
absolute error of the vector h by the relation Ay = A~ 1 Ah } whence 

\\Ay\\ < WA-'W ■ \\Ah\\ 

and \\Ay\\/\\y\\^\\A-i\\-m/\\y\\)mh\\/\\h\\). 
13 



Taking into account the fact that \\h\\ < \\A\\ ■ \\y\\, we finally obtain 

(26) IIAyll/llylKPII.IIA^II-IIATOII, 

i.e., the relative error of the solution y is estimated via the relative error of the 
vector h by means of the condition number. It is clear that (26) can turn into an 
equality. Thus, if the condition number is of order 10 fe , then, because of round-off 
errors in h, we can lose k decimal digits of y. 

Similarly, the contribution of the error of the matrix A is evaluated. Finally, 
the dependence of cond(A) on the choice of a norm is weak. A method of rapid 
estimation of the condition number is described in [41, §3.2]. The analysis of the 
cases when the condition number gives a much too pessimistic error estimate is 
given in [42]. 

As an example, we note that the coefficients of the polynomial P n (x) which give 
the best approximant to the function f(x) in the metric of the Hilbert space L? w 
(see §5 above) can be determined from the system of equations 

l 

(27) J (f(x) - P n (x))x k w(x) dx = 0, 



where k = 0, 1, . . . , n. With respect to coefficients of the polynomial P n {x) (in pow- 
ers of x or in Chebyshev polynomials) these equations are linear and algebraic. But 
due to the fact that the monomials x k are "almost linearly dependent" , system (27) 
is very ill-conditioned. The equivalent system 

l 

(27') J (f(x) - P n {x))T k (x)w(x) dx = 



is better conditioned, but in this case it is also preferable to use the economization 
procedure or to determine the coefficients c, in (19) by formulas 

l 

c = ^ J f(x)w(x) dx, 

-l 
i 

Ci = ^ J f(x)Ti(x)w(x) dx, 



(28) 



i > 1. 



We recall that here w(x) = 1/vl — x 2 . 

§7. The effect of error autocorrection 

Fix an abstract construction method (problem) for an approximant of the form 
(21) to the function f(x). Let the coefficients dj, bj give an exact or an approximate 
solution of this problem, and let the en, bj give another approximate solution ob- 
tained in the same way. Denote by Aotj, Abj the absolute errors of the coefficients, 
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i.e., Aai — a, — a^, A6j = 6j — bj; these errors arise due to perturbations of the 
approximated function f(x) or due to calculation errors. Set 

n m 
n m 

AP(x) = Y AaiPh AQ(x) = Y Abjipj, 

i=0 j=0 

P{x) = P + AP, Q(x) =Q + AQ. 

It is easy to verify that the following exact equality is valid: 



(29) 



AP P _ AQ ( AP 



Q + AQ Q Q yAQ Q J ' 



As mentioned in the Introduction, the fact that the problem of calculating co- 
efficients is ill-conditioned can nevertheless be accompanied by high accuracy of 
the approximants obtained. This means that the approximants P/Q and P/Q 
are close to the approximated function and, therefore, are close to each other, al- 
though the coefficients of these approximants differ greatly. In this case the relation 
AQ/Q = AQ/ (Q + AQ) of the denominator considerably exceeds in absolute value 
the left-hand side of equality (29). This is possible only in the case when the differ- 
ence AP/ AQ — P/Q is small, i.e., the function AP/ AQ is close to P/Q, and, hence, 
to the approximated function. Thus the function AP/ AQ will be called the error 
approximant. For a special case, this concept was actually introduced in [5] . In the 
sequel, we shall see that in many cases the error approximant provides indeed a 
good approximation for the approximated function, and, thus, P/Q and P/Q differ 
from each other by a product of small quantities in the right-hand side of (29). The 
thing is that the errors Aa,, Abj are not arbitrary, but are connected by certain 
relations. 

Let an abstract construction method for the approximant of the form (21) be 
linear in the sense that the coefficients of the approximant can be determined from 
a homogeneous system of linear algebraic equations. The homogeneity condition is 
connected with the fact that, when multiplying the numerator and the denomina- 
tor of fraction (21) by the same nonzero number, the approximant (21) does not 
change. Denote by y the vector whose components are the coefficients ao, ai, . . . , a n , 
bo, bi, . . . , b m . Assume that the coefficients can be obtained from the homogeneous 
system of equations 



(30) Hy = 0, 

where H is a matrix of dimension (m + n + 2) x (m + n + 1). 

The vector y is an approximate solution of system (30) if the quantity ||-ffy|| 
is small. If y and y are approximate solutions of system (30), then the vector 
Ay = y — y is also an approximate solution of this system since \\H Ay\\ — \\Hy — 
Hy\\ < \\Hy\\ + \\Hy\\. Thus it is natural to assume that the function AP/AQ 
corresponding to the solution Ay is an approximant to f(x). It is clear that the 
order of the residual of the approximate solution Ay of system (30), i.e., of the 
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quantity ||i/A?/||, coincides with the order of the largest of the residuals of the 
approximate solutions y and y. For a fixed order of the residual the increase of the 
error Ay is compensated by the fact that Ay satisfies the system of equations (30) 
with greater "relative" accuracy and the latter, generally speaking, leads to the 
increase of accuracy of the error approximant. 

To obtain a certain solution of system (30), one usually adds to this system a 
normalization condition of the form 

n m 

(31) Xidj + Hjbj = i, 

where Aj, /ij are numerical coefficients. As a rule, the equality bo = 1 is taken 
as the normalization condition (but this is not always successful with respect to 
minimizing the calculation errors). 

Adding equation (31) to system (30), we obtain a nonhomogcneous system of 
m + n + 2 linear algebraic equations of type (22). If the approximate solutions y and 
y of system (30) satisfy condition (31), then the vector Ay satisfies the condition 

n m 

(31') 5^AiAoi + 5^/ijA6j = 0, 

i=0 j=0 

It is clear that the above reasoning is not rigorous; for each specific construction 
method for approximations it is necessary to carry out some additional analysis. 
More accurate reasoning is given below, in §8, for the classical Pade approximants, 
and in §14, for the linear and nonlinear Pade-Chebyshev approximants. The pres- 
ence of the error autocorrection mechanism described above is also verified by a 
numerical experiment (see below). 

The effect of error autocorrection reveals itself for certain nonlinear construction 
methods for rational approximations as well. One of these methods is considered 
below, in §12-14 (nonlinear Pade-Chebyshev approximation). 

It must be emphasized that (as noted in §3) the coefficients of the best Chebyshev 
approximant satisfy the system of linear algebraic equations (9) and are computed 
as approximate solutions of this system on the last step of the iteration process 
in algorithms of Remez's type. Thus, the construction methods for the best ratio- 
nal approximants can be regarded as linear. At least for some functions (say, for 
cos7r/4a;, — 1 ^ x ^ 1) the linear and the nonlinear Pade Chebyshev approximants 
are very close to the best ones in the sense of the relative and the absolute errors, 
respectively. The results that arise when applying calculation algorithms for Pade- 
Chebyshev approximants can be regarded as approximate solutions of system (9) 
which determines the best approximants. Thus the presence of the effect of error 
autocorrection for Pade-Chebyshev approximants gives an additional argument in 
favor of the conjecture that this effect also takes place for the best approximants. 

Finally, note that the basic relation (29) becomes meaningless if one seeks an 
approximant in the form a <f + anpi + • • • + a n (p n , i.e., the denominator in (21) 
is reduced to 1. However, in this case the effect of error autocorrection (although 
much weakened) is also possible; this is connected with the fact that the errors Aai 
approximately satisfy certain relations. Such a situation can arise when using the 
least squares method. 
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§8. Pade approximations 



Let the expansion of a function f(x) into a power series (the Taylor series at 
zero) be given, i.e., 

oo 

(32) /(a0 = 5>z\ 

i=0 

The classical Pade approximant for f(x) is a rational function of the form 



(33) R(x) = P n (x)/Q m (x), 

where P n {x) and Q m (x) are polynomials of degree n and to, respectively, satisfying 
the relation 

(34) Q m (x)f(x) - P n {x) = 0(x m+n+1 ). 
Let 

P n (x) = a + aix H ha„a;™, 

(35) 

Qm(i) = &o + M + --- + & m a; m 



If 6o ^ 0, then (34) means that 

(34') f{x) - Pn(x)/Q m (x) = 0(x m+n+1 ), 



i.e., the first to + n + 1 terms of the Taylor expansion in powers of x (to x m+ ™ 
inclusive) of fix) and i?(x) are the same. The Pade approximation gives the best 
approximant in a small neighborhood of zero; it is a natural generalization of the 
expansion of functions into Taylor series and is closely connected with the expansion 
of functions into continued fractions. Numerous papers are devoted to the Pade 
approximation; sec, for example, [11-16, 5, 6]. 

One can evaluate the coefficients bj in the denominator of fraction (33) by solving 
the homogeneous system of linear equations 



(36) 



c n+k+jbj — —boC n+ k, 



3 = 1 



where k = 1, . . . , to and q = for I < 0. One can take any nonzero constant as bo. 
The coefficients ai are given by the formulas 

i i 

(37) a,i = ^2 b k Ci-k = ^ b t~kCk- 

k=0 k=0 

The text of the corresponding Fortran program is given in [11]. 

For large to the system (36) is ill-conditioned. Moreover, the problem of com- 
putation for coefficients of Pade approximants is also ill-conditioned independent 
of a particular solving algorithm for this problem, see [6, 43, 44]. In Y. L. Luke's 
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paper [5] the following reasoning is given. Let Aa,, Abj be the errors in the coeffi- 
cients di, bj which arise when numerically solving system (36). We shall ignore the 
errors of the quantities Cj and x and we shall consider that, according to (37), the 
errors in the coefficients a, have the form 

i 

(37') Aai = J2 Ab i-kCk- 

k=0 

From (37') it follows that 

n 

Adix i 

i=0 

n i 

^2^2Abi- k c k x l , 

i=o k=a 

the latter, after the change of indices, yields the relation similar to (34): 

AQ{x)f{x) - AP(x) = 0(x n+1 ). 

Thus, there are reasons to expect that the error approximant approximates indeed 
the function f(x) and the effect of error autocorrection takes place. In [5] the cor- 
responding experimental data for the function e~ x for x — 2, m = n = 6, 7, . . . , 14, 
and for x — 5 are given and the experiments with the functions x^ 1 ln(l + x), 
(1 + a;)* 1 / 2 , xe x t~ 1 e~ t dt are briefly described; see also [6]. 

A natural generalization of the classical Pade approximant is the multipoint Pade 
approximant (or Pade approximant of the second kind), i.e., a rational function of 
the form (33) whose values coincide with values of the approximated function f(x) 
at some points Xi (i — l,2,...,ra + n + l). This definition is extended to the case of 
multiple points, and for Xi = for all i it leads to the classical Pade approximations 
see [11, 14, 15]. The calculation of coefficients in the multipoint Pade approximant 
can be reduced to solving a system of linear equations, and there are reasons to 
suppose that in this case the effect of error autocorrection takes place as well. 

§9. Linear Pade-Chebyshev approximations and the PADE program 

Consider the approximant of the form (33) to the function f(x) on the segment 
[— 1 , 1] . The absolute error function of this approximant has the following form: 

A(a:) - $(i)/Q m (i), 

where 



m oo 

fAQ -AP = J2 Ab i xJ Cixi ~ 

j=0 i=0 
m co 



(38) $(a;) = f(x)Q m (x) - P n (x). 

The function R m ,n{x) = P n (x)/Q m (x) is called the linear Pade-Chebyshev approx- 
imant to the function f(x) if 

l 

(39) J <S>{x)T k {x)w{x) dx = 0, k = 0, 1, . . . , m + n, 

-l 
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where Tk(x) are the Chebyshev polynomials, w(x) = l/Vl — x 2 . This concept (in 
a different form) was introduced in [45] and allows a generalization to the case 
of other orthogonal polynomials (see [11, 33, 34, 39, 40]). Approximants of this 
kind always exist [39]. Reasoning in the same way as in §5 and applying Cheney's 
theorem, we can find out why the linear Pade-Chebyshev approximants are close 
to the best ones. 

Let P n (x) and Q m (y) be represented in the form (35). Then the system of 
equations (39) is equivalent to the following system of linear algebraic equations 
with respect to the coefficients a, , bj : 

( 4o) y b /[^0m dx .y ai j^K dx = o . 

The homogeneous system (40) can be transformed into a nonhomogeneous one 
by adding a normalization condition; in particular, any of the following equalities 
can be taken as this condition: 



(41) b a = 1, 

(42) 6 m =l, 

(43) a m = 1. 

In [1, 9] the program PADE (in Fortran, with double precision) which allows to 
construct rational approximants by solving the system of equations of type (40) is 
briefly described. The complete text of a certain version of this program and its 
detailed description can be found in the Collection of algorithms and programs of 
the Research Computer Center of the Russian Acad. Sci [7]. For even functions 
the approximant is looked for in the form 

(44) R{x) _a + a lX > + ... + a n ( X >r 



ba + hx 2 + --- + b m (x 2 ) m ' 
and for odd functions it is looked for in the form 

a + aix 2 H h a n (x 2 ) n 



(45) R{x) = x 



bQ + b lX 2 + --- + b m (x 2 ) m '' 



respectively. The program computes the values of coefficients of the approximant, 
the absolute and the relative errors, and gives the information which allows to 
estimate the quality of the approximation (see §4 above). In particular, a version 
of the PADE program is implemented by means of minicomputer of SM-4 class 
constructs the error curve, determines the presence of alternation, and produces 
the estimate of the quality of the approximation by means of quantity (12). Using 
a subroutine, the user introduces the function defined by means of any algorithm 
on an arbitrary segment [A, B], introduces the boundary points of this segment, the 
numbers m and n, and the number of control parameters. In particular, one can 
choose the normalization condition of type (41)-(43), look for an approximant in the 
form (44) or (45) and so on. The change of the variable reduces the approximation 
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on any segment [A, B] to the approximation on the segment [—1,1]. Therefore, we 
shall consider the case when A = — 1, £> = 1 in the sequel unless otherwise stated. 

For the calculation of integrals, the Gauss-Hermite-Chebyshev quadrature for- 
mula is used: 



i s 

f <p(x) n xr f 2i - 1 \ 

/ , ax — — } ip\ cos — - — 7r , 

_i *-■ i 



where s is the number of interpolation points; for polynomials of degree 2s — 1 
this formula is exact, so that the precision of formula (46) increases rapidly as 
the parameter s increases and depends on the quality of the approximation of the 
function ip(s) by polynomials. To calculate the values of Chebyshev polynomials, 
recurrence relation (15) is applied. 

If the function f(x) is even and an approximant is looked for the form (44), then 
system (40) is transformed into the following system of equations: 



where k = 0, 1, . . . , m + n. If f(x) is an odd function and an approximant is looked 
for in the form (45), then, first, by means of the solution of system (47) comple- 
mented by one of the normalization conditions, one determines an approximant of 
the form (44) to the even function f(x)/x, and then the obtained approximant is 
multiplied by x. This procedure allows to avoid a large relative error for x = 0. 

The possibilities of the PADE program are demonstrated in Table 1. This table 
contains errors of certain approximants obtained by means of this program. For 
every approximant, the absolute error A, the relative error 6, and (for comparison) 
the best possible relative error <5 m j n taken from [3] are indicated. The function y/x is 
approximated on the segment [1/2, 1] by the expression of the form (1), the function 
cos jX is approximated on the segment [—1, 1] by the expression of the form (44), 
and all the others are approximated on the same segment by the expression of the 
form (45). 

Table 1 
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Function 


m 


n 




A 






5 




^min 




/ — 

y/X 


Z 


2 


A D 

0.8 • 


1 a — 

10 


6 


1.13 


• 10 


-6 


0.6 • 


10" 


6 


rz 


Q 
O 


Q 
O 


1 A 

i.y 


1 A — 
1U 


9 


Z. 1 


1 A- 
1U 


9 


1.12 


• 10 


-9 


7T 

COS -gX 


A 
U 


q 
o 


U.zo 


1 ft 

• iu 


-7 


u.oy 


1 ft 

• 1U 


-7 


0.32 


• 10 


-7 


71 

COS -gX 


1 
1 


o 
Z 


U.z4 


• 1U 


-7 


U.o4 


• 1U 


-7 


0.29 


• 10 


-7 


cos gX 


z 


z 


u.oy 


1 n- 

1U 


-10 


n o/i 
u.»4 


1 n- 

1U 


-10 


0.79- 


10" 


10 


TT 

COS -^x 


A 
U 





n K7 

V.O 1 


1 n- 

1U 


-13 


u. < y 


1 n- 

1U 


-13 


0.66- 


10" 


13 


TT 

COS gX 


z 


q 
o 


U.4 • 


1U 


13 


U.OD 


1 n- 

1U 


-13 


0.46- 


10" 


-13 


TT 

sin jX 


A 
U 


/I 
4 


U.o4 


1 n~ 

1U 


-11 


U.40 


1 n _ 

1U 


-11 


0.47- 


10" 


11 


sin fa; 


z 


o 
Z 


U.oz 


1U 


-11 


U.40 


1U 


-11 


0.44- 


10" 


11 


sin fa; 


A 

u 





U.oO 


1U 


-14 


U.OD 


1U 


-14 


0.45- 


10" 


14 


sin fa; 


1 
1 


1 


U.14 


1 n 
• 1U 


-3 


0.14 


• 10 


-3 


0.12 


• 10 


-3 


sin fa; 


A 



4 


A 

0.67 


1 A 

• 10 


-8 


0.67 


• 10 


-8 


0.54 


• 10 


-8 


sin fa; 


2 


2 


0.63 


• 10 


-8 


0.63 


• 10 


-8 


0.53 


• 10 


-8 


sin fa; 


3 


3 


0.63 


10" 


- 13 


0.63 


10" 


- 13 


0.5 • 


io- 


13 


tgjx 


1 


1 


0.64 


• 10 


— 5 


0.64 


• 10 


— 


0.57 
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The PADE program is comparatively simple and compact; it includes the stan- 
dard subroutine DGELG for solving systems of linear algebraic equations (this 
subroutine is taken from [46]) and a subroutine of numerical integration, and also 
a number of service, test and auxiliary modules. No additional software is used. 
The program needs minimum hardware requirements and can be implemented by 
means of any computer having a Fortran compiler, random-access memory of suf- 
ficient volume, a printer or a display. 

The version of the PADE program described in [7] is implemented by means 
of computers of IBM 360/370 class and requires 60 K bytes of main memory; the 
volume of this program in Fortran (including comments) is 581 lines (cards). The 
program execution time depends on the type of the computer, on the approximated 
function, and on the values of control parameters. For example, the CPU time for 
determining, by means of the PADE program, an approximant of the form (1) to 
the function ^fx on the segment [1/2, cc] for m = n = 2 is 4.4s. In this case the 
normalization (43) is applied, and the number of checkpoints used while estimating 
the error is 1200; the compilation time is not taken into account 1 . 

One of the versions of the program gives the estimate of the quality of the 
approximant obtained according to formula (12) (see §4 above). For example, for 

1 Onc can sufficiently decrease the number of checkpoint without considerable loss of accuracy 
of error estimation (in the present case, for example, to 200 points). 
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the function sin ^x for m = n = 2, and for the relative error we have q — 0.0625 
whence it follows that 5 m i n ^ qS w 0.4 • 10~ 9 . This estimate is rough and in fact, as 
is shown in Table 1, <5 m i n /<> ~ 0.84. For the absolute error the program gives in this 
case q = 0.71. The latter indicates to the closeness of this error to the best possible. 
The version of the program mentioned above allows to carry out the calculations in 
interactive mode varying the degrees m and n, the boundary points of the segment 
[A, B], the branches of the algorithm, the number of checkpoints when the errors 
are calculated, the number of interpolation points in the quadrature formula (46), 
and to estimate rapidly the quality of the approximation according to the error 
curve. 

Remark. The program of constructing classical Pade approximants given in [11] 
is also called PADE, but, of course, here and in [11] different programs are discussed. 

§10. The PADE program. Analysis of the algorithm 

The quality of an approximant obtained by means of the PADE program mainly 
depends on the behavior of the denominator of this approximant and on the cal- 
culation errors. The fact that the corresponding systems of algebraic equations 
are ill-conditioned is the most unpleasant the source of errors of the method under 
consideration. Seemingly, the methods of this kind are not widely used due to this 
reason. 

The condition numbers of systems of equations that arise while calculating, by 
means of the PADE program, the approximants considered above are also very 
large, for example, while calculating the approximant of the form (5) on the segment 
[—1, 1] to sin for m = n = 3, the corresponding condition number is of order 10 13 . 
As a result, the coefficients of the approximant are determined with a large error. 
In particular, a small perturbation of the system of linear equations arising when 
passing from computer ICL 4-50 to ES-1045 (because of the calculation errors) 
gives rise to large perturbations in the coefficients of the approximant. Fortunately, 
the effect of error autocorrection (see §7 above) improves the situation, and the 
errors of the approximant have no substantial changes under this perturbation. 
This fact is described in the Introduction, where concrete examples are also given. 

Consider some more examples connected with passing from ICL 4-50 to ES-1045. 
The branch of the algorithm which corresponds to the normalization condition (41) 
(i.e., to b = 1) is considered. For arctgx the calculation of an approximant of the 
form (45) on the segment [—1,1] for m = n = 5 by means of ICL-4-50 computer 
gives an approximation with the absolute error A = 0.35 • 10~ 12 and the relative 
error S = 0.16 • 10 . The corresponding system of linear algebraic equations has 
the condition number of order 10 30 ! Passing to ES-1045 we obtain the following: 
A = 0.5 • 10~ 14 , S = 0.16 • 10~ 12 , the condition number is of order 10 14 , and 
the errors Aa± and Ab\ in the coefficients a\ and b\ in (45) are greater in absolute 
value than l! This example shows that the problem of computing condition number 
of an ill-conditioned system is, in its turn, ill-conditioned. Indeed, the condition 
number is, roughly speaking, determined by values of coefficients of the inverse 
matrix (see §6 above, eqs (24) and (25)), every column of the inverse matrix being 
the solution of the system of equations with the initial matrix of coefficients, i.e., 
of an ill-conditioned system. 

Consider in more detail the effect of error autocorrection for the approximant 
of the form (44) on the segment [—1,1] to the function cos ^xforr7i = 2,n = 3. 
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Constructing this approximant both on the ICL-4-50 and the ES-1045 computer 
results in the approximation with the absolute error A = 0.4 • 10" 13 and the rela- 
tive error 5 = 0.55 • 10~ 13 which are close to the best possible. In both the cases 
the condition number is of order 10 9 . The coefficients of the approximants ob- 
tained by means of the computers mentioned above and the coefficients of the error 
approximant (see §7 above) are as follows: 



&0 = 


0.9999999999999600, 








do = 


0.9999999999999610, 


Aa 


= -10" 


15 

•> 




— 2925310453579570 








ai = 


-0.2925311264716216, 


Aai 


= io- 7 


•0.811136646, 


a 2 = 


10" 1 • 0.1I052542547I6866, 








CL2 = 


10" 1 • 0.1 105256585556549, 


Aa 2 


= -10" 


7 • 0.2330839683, 


S3 = 


10~ 3 • 0.I04947450090440I , 








a-3 = 


10~ 3 • 0.1049482094850086, 


Aa 3 


= 10- 9 


• 0.7593947685, 


bo = 


1, 








bo = 


1, 


Ab 


= 0, 




h = 


10" 1 • 0.1 58940921 7324021 , 












A61 


= io- 7 


•0.8111363684, 


bi = 


10" 1 • 0.1 58940II05960337, 






b 2 = 


10~ 3 • 0.I0033590II092697, 


Ab 2 


= 10- 8 


•0.17093009168. 


b 2 = 


10~ 3 • 0.I00334I9I8083529, 








, the 


error approximant has the form 









(48) 



AP 
AQ 



Aa + Aaix 2 



Aa 2 x A + Aa 3 x 6 



Abix 2 + Ab 2 x 4 



If the relatively small quantity Aa = — 10~ 15 in (48) is omitted, then, as testing by 
means of a computer shows (2000 checkpoints), this expression is an approximant 
to the function cos fx on the segment [—1,1] with the absolute and the relative 
errors A = S = 0.22 • 10~ 6 . 

But the polynomial AQ is zero at x = 0, and the polynomial AP takes a small, 
but nonzero value at x = 0. Fortunately, equality (29) can be rewritten in the 
following way: 



(49) 



P P 

q Q 



AP 



AQ P 
Q Q' 



Thus, as AQ — > 0, the effect of error autocorrection arises because the quantity AP 
is close to zero, and the error of the approximant P/Q is determined by the error of 
the coefficient a . The same situation also take place when the polynomial AQ van- 
ishes at an arbitrary point xq belonging to the segment [A, B] where the function is 
approximated. It is clear that if one chooses the standard normalization (60 = 1), 
then the error approximant has actually two coefficients less than the initial one. 
Relations (38) and (39) show that in the general case the normalization conditions 
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a n = 1 or b m = 1 result in the following: the coefficients of the error approximant 
form an approximate solution of the homogeneous system of linear algebraic equa- 
tions whose exact solution determines the Pade-Chebyshev approximant having 
one coefficient less than the initial one. The effect of error autocorrection improves 
again the accuracy of this error approximant; thus, "the snake bites its own tail". 
A situation also arises in the case when the approximant of the form (44) to an 
even function is constructed by solving the system of equations (47). 

Sometimes it is possible to decrease the error of the approximant by means of 
the fortunate choice of the normalization condition. As an example, consider the 
approximation of the function e x on the segment [—1,1] by rational functions of the 
form (1) for m = 15, n = 0. For the traditionally accepted normalization bo = 1, 
the PADE program yields an approximant with the absolute error A = 1.4 • 1CP 14 
and the relative error S = 0.53 • 1CP 14 . After passing to the normalization condition 
&15 = 1, the errors are reduced nearly one half: A = 0.73 • 10~ 14 , S = 0.27 • 10~ 14 . 
Note that the condition number increases: in the first case it is 2 • 10 6 , and in the 
second case it is • 10 16 . Thus the error decreases notwithstanding the fact that 
the system of equations becomes drastically ill-conditioned. This example shows 
that the increase of accuracy of the error approximant can be accompanied by the 
increase of the condition number, and, as experiments show, by the increase of 
errors of the numerator and the denominator of the approximant. The fortunate 
choice of the normalization condition depends on the particular situation. 

A specific situation arises when the degree of the numerator (or of the denomina- 
tor) of the approximant is equal to zero. In this case the unfortunate choice of the 
normalization condition results in the following: the error approximant becomes 
zero or is not well-defined. For n — it is expedient to choose condition (42), as 
it was done in the example given above. For m — (the case of the polynomial 
approximation) it is usually expedient to choose condition (43). Otherwise the sit- 
uation will be reduced to solving the system of equations (27') in the case described 
in §6 above. 

Since the double precision regime of ES-1045 corresponds to 16 decimal digits 
of mantissa in the computer representation of numbers, while running computers 
of this type it makes sense to vary the normalization condition only in case the 
condition number exceeds 5 ■ 10 16 , where 5 is the relative error of the obtained 
approximant. The value of the condition number of the corresponding system of 
linear algebraic equations is given by the PADE program simultaneously with other 
computation results. 

The theoretical error of the method is determined, to a considerable extent, by 
the behavior of the approximant's denominator. It is convenient for the analysis, 
by dividing the numerator and the denominator of the fraction by bo to equate bo to 
1. If the coefficients b\, 62, . . . , b m are small in comparison with bo = 1, which often 
happens in computation practice, then the absolute error A(x) and its numerator 
$(x) = f(x)Q(x) — P(x) are of the same order, so that the minimization of $(x) 
leads to the minimization of the error A(x), see §9 above. Note that the coefficients 
of approximant (45) to the function arctg x on the segment [—1,1] are not small in 
comparison with bo- For example, for m = n = 3 the coefficient b\ is almost one 
and half times greater than the coefficient bo- Thus, as shown in Table 1, the errors 
of the approximant to arctg x obtained by means of the PADE program are several 
times greater than the errors of the best approximants. 

Note that sometimes it is possible to improve the denominator of the approxi- 
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mant or to reduce the condition number of the corresponding system of equations 
by extending the segment [A, B] where the function is approximated. Such an ef- 
fect is observed, for example, when approximants to some hyperbolic functions are 
calculated. 

Note that the replacement of the standard subroutine DGELG for solving system 
of linear algebraic equations by another subroutine of the same kind (for example, 
by the DECOMP program from [41]) does not essentially affect the quality of 
approximants obtained by means of the PADE program. 

One could seek the numerator and the denominator of the approximant in the 
form 

n 

i=0 
m 

3=0 

where T t are the Chebyshev polynomials. In this case the system of linear equations 
determining the coefficients would be better conditioned. But the calculation of 
the polynomials of the form (50) by, for example, the Chenshaw method, results in 
lengthening the computation time, although it has a favorable effect upon the error 
of calculations, see [47, Chapter IV, §9]. The transformation of the polynomials 
P and Q from the form (50) into the standard form (35) also requires additional 
efforts. 

In practice it is more convenient to use approximants represented in the form (1), 
(44), or (45), and calculate the fraction's numerator and denominator according the 
Horner scheme. In this case the normalization a n — 1 or b m = 1 allows to reduce 
the number of multiplications. Thus the PADE program gives coefficients of the 
approximant in the two forms: with the condition bo = 1 and with one of the 
conditions a n = 1 or — 1 rio matter which one of the conditions (41)— (43) is 
actually used while solving the system of equations of type (39) or (40). 

The PADE program (and the corresponding algorithm) can be easily modified, 
for example, to take into account the case when some coefficients are fixed before- 
hand. One can vary the systems of equations under consideration by changing the 
weight w(x), the interval where the functions are approximated, and the system 
of orthogonal polynomials. By a certain increase in complexity of the system of 
equations (40) it is possible to minimize the norm of the numerator <!>(x) of the 
error function A(x) in the Hilbert space l? w (see §5 above). 

The use of the PADE program does not require that the approximated function 
be expanded into a series or a continued fraction beforehand. Equations (39) or (40) 
and the quadrature formula (46) show that the PADE program uses only the values 
of the approximated function f(x) at the interpolation points of the quadrature 
formula (which are zeros of some Chebyshev polynomial). 

On the segment [—1,1] the linear Pade-Chebyshev approximants give a consid- 
erably smaller error than the classical Pade approximants. For example, the Pade 
approximant of the form (1) to the function e x for m = n = 2 has the absolute error 
A(l) = 4- 10~ 3 at the point x = 1, but the PADE program gives an approximant of 
the same form with the absolute error A = 1.9 • 10~ 4 (on the entire the segment), 
i.e., the latter is 20 times smaller than the previous one. The absolute error of the 
best approximant is 0.87 • 10~ 4 . 
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§11. The "cross-multiplied" linear 
Pade-Chebyshev approximation scheme 

As a rule, linear Pade-Chebyshev approximants are constructed according to 
the following scheme [45, 3, 11, 12]. Let the approximated function be decomposed 
into the series in Chcbyshev polynomials 

oo 

(51) f(x) = V' c.T^x) - \c + cxT^x) + c 2 T 2 (x) + ..., 

where the notation Y^'T=o Ui means that the first term u in the sum is replaced by 
Uq/2. The rational approximant is looked for in the form 

n 

RW = ^ ; 

E'bjTjix) 

3=0 

the coefficients bj are determined by means of the system of linear algebraic equa- 
tions 

m 

(53) 2^ bj(c i+j + c\i_j\) = 0, i = n + 1, . . . ,n + m, 

1=0 

and the coefficients at are determined by the equalities 

m 

(54) a i = -\b j (c i+j +c\ i _ j {)=Q, i = 0, 1, . . . ,n. 

j=o 

It is not difficult to verify that this algorithm must lead to the same results as the 
algorithm described in §9 if the calculation errors are not taken into account. 

The coefficients cj, for k = 0, 1, . . . ,n + 2m, are present in (53) and (54), i.e., 
it is necessary to have the first n + 2m + 1 terms of series (51). The coefficients 
Cfe are known, as a rule, only approximately. To determine them one can take the 
truncated expansion of f(x) into the series in powers of x (the Taylor series) and 
by means of the economization procedure transform it into the form 

n+2m 

(55) 5 ^( x )' 

i=0 

§12. Nonlinear Pade-Chebyshev approximations 

A rational function R(x) of the form (1) or (52) is called a nonlinear Fade 
Chebyshev approximant to the function f(x) on the segment [—1, 1], if 

l 

(56) J (f(x) - R{x))T k {x)w{x) dx = 0, k = 0, 1, . . . , m + n, 
-l 
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(52) 



where Tfc(x) are the Chebyshev polynomials, w(x) = \j\J\ — x 1 . Cheney's theorem 
(see §5 above) shows that the absolute error function A(x) = f(x) — R{x) has 
alternation. Thus, there are reasons to assume that the nonlinear Pade-Chebyshev 
approximants are close to the best ones in the sense of the absolute error. 

In the paper [32] the following algorithm of computing the coefficients of the 
approximant indicated above is given. Let the approximated function f(x) be ex- 
panded into series (51) in Chebyshev polynomials. Determine the auxiliary quan- 
tities ji from the system of linear algebraic equations 

m 

(57) lj c \k-j\ — 0, k = n + l,n + 2, . . . ,n + m, 

1=0 

assuming that 70 = 1. The coefficients of the denominator in expression (52) are 
determined by the equalities 

m-j 

*=o 

where = 1/2 Y^i=i if'i this implies 60 = 2. Finally the coefficients of the 
numerator are determined by formula (54). It is possible to solve system (57) 
explicitly and to indicate the formulas for computing the quantities ji. One can 
also estimate explicitly the absolute error of the approximant. This algorithm is 
described in detail in the book [33]; see also [11]. 

In contrast to the linear Pade-Chebyshev approximants, the nonlinear approx- 
imants of this type do not always exist, but it is possible to indicate explicitly 
verifiable conditions guaranteeing the existence of such approximants [33]. The 
nonlinear Pade-Chebyshev approximants (in comparison with the linear ones) have, 
as a rule, a somewhat smaller absolute errors, but can have larger relative errors. 
Consider, as an example, the approximant of the form (1) or (52) to the function 
e x on the segment [— 1, 1] for m = n = 3. In this case the absolute error for a 
nonlinear Pade-Chebyshev approximant is A = 0.258 • 10~ 6 , and the relative error, 
S = 0.252 • 10~ 6 ; for the linear Pade-Chebyshev approximant A = 0.33 • 10~ 6 and 
5 = 0.20- 10- 6 . 

§13. Applications of the computer algebra system 
reduce to the construction of rational approximants 

The computer algebra system REDUCE [48, 49] allows to handle formulas at 
symbolic level and is a convenient tool for the implementation of algorithms of 
computing rational approximants. The use of this system allows to bypass the 
procedure of working out the algorithm of computing the approximated function 
if this function is presented in analytical form or when either the Taylor series 
coefficients are known or are determined analytically from a differential equation. 
The round-off errors can be eliminated by using the exact arithmetic of rational 
numbers represented in the form of ratios of integers. 

Within the framework of the REDUCE system, the program package for en- 
hanced precision computations and construction of rational approximants is imple- 
mented; see, for example [8]. In particular, the algorithms from §11 and §12 (which 
are similar to each other in structure) are implemented, the approximated function 
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being first expanded into the power (Taylor) series, / = J2^ =0 f^ k \0)x k /k\, and 
then the truncated series 



k 



(58) £/ (fc) (°)fr> 

fe=0 

consisting of the first N + 1 terms of the Taylor series (the value N is determined 
by the user) being transformed into a polynomial of the form (55) by means of the 
economization procedure. 

The algorithms implemented by means of the REDUCE system allow to obtain 
approximants in the form (1) or (52), estimates of the absolute and the relative 
error, and the error curves. The output includes the Fortran program of computing 
the corresponding approximant, the constants of rational arithmetic being trans- 
formed into the standard floating point form. When computing the values of the 
obtained approximant, this approximant can be transformed into the form most 
convenient for the user. For example, one can calculate values of the numerator 
and the denominator of the fraction of the form (1) according to the Horner scheme, 
and for the fraction of the form (52), according to Clcnshaw scheme, and transform 
the rational expression into a continued fraction or a Jacobi fraction as well. 

The ALGOL-like input language of the REDUCE system and convenient tools 
for solving problems of linear algebra guarantee simplicity and compactness of the 
programs. For example, the length of the program for computing linear Pade- 
Chebyshev approximants is sixty two lines. 

§14. The effect of error autocorrection for 
nonlinear pade-chebyshev approximations 

Relations (56) can be regarded as a system of equations for the coefficients of the 
approximant. Let the approximants R(x) = P(x)/Q(x) and R(x) — P(x)/Q(x), 
where P(x), P(x) are polynomials of degree n and Q(x), Q(x) are polynomials of 
degree m, be obtained by approximate solving the indicated system of equations. 
Consider the error approximant AP(x)/ AQ(x), where AP(x) — P(x) — P(x), 
AQ(x) = Q(x) — Q(x). Substituting R(x) and R(x) in (56) and subtracting one 
of the obtained expressions from the other, we see that the following approximate 
equality holds: 



id 



P(x) P(x) 



Tk{x)w(x) dx ~ 0, k = 0, 1, 



,Q(x) Q{x), 

— l 

This and equality (29) imply the approximate equality 



(59) j(^m_m)^ Tki 



x)w(x) dx 



where k = 0, 1, . . . ,m + n, w(x) = — x 2 . If the quantity AQ is relatively 

not small (this is connected with the fact that the system of equations (57) is 
ill-conditioned), then, as follows from equality (59), we can naturally expect that 
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the error approximant is close to P/Q and, consequently, to the approximated 
function f(x). 

Due to the fact that the arithmetic system of rational numbers is used, the soft- 
ware described in §13 allows to eliminate the round-off errors and to estimate the 
"pure" influence of errors in the approximated function on the coefficients of the 
nonlinear Pade-Chebyshev approximant. In this case the effect of error autocor- 
rection can be substantiated by a more accurate reasoning which is valid both for 
nonlinear Pade-Chebyshev approximants and for linear ones, and even for the lin- 
ear generalized Pade approximants connected with different systems of orthogonal 
polynomials. This reasoning is analogous to Y. L. Luke's considerations [5] given 
in §8 above. 

Assume that the function f(x) is expanded into series (51) and that the rational 
approximant R(x) = P(x)/Q(x) is looked for in the form (52). 

Let Abj be the errors in coefficients of the approximant's denominator Q. In 
the linear case these errors arise when solving the system of equations (53), and in 
the nonlinear case, when solving the system of equations (54). In both the cases 
the coefficients in the approximant's numerator are determined by equations (54), 
whence we have 

1 

(60) Aai = - Abj(c i+j + C|i_j|), i = 0, 1, . . . ,n. 

This implies the following fact: the error approximant AP/AQ satisfies the rela- 
tions 



(61) J (f(x)AQ(x)-AP(x))Ti(x)w(x)dx = Q, 



i = 0,l,...,n, 



which are analogous to relations (39) defining the linear Pade-Chebyshev approx- 
imants. Indeed, let us use the well-known multiplication formula for Chebyshev 
polynomials: 

(62) Ti{x)Tj{x) = \ [T i+j (x) + 2] W | (x)] , 

where i, j are arbitrary indices; see, for example [11-13, 33]. Taking (62) into 
account, the quantity fAQ — AP can be rewritten in the following way: 

]T' MjTA ( Ci tA - J2' Aa.T, 

j=0 ' ^ i=0 ' i=0 

oo r m _. n 

1 v-V X ^' . ^ ' 



2 

i=0 L j=0 

This formula and (60) imply that 



i=0 



fAQ-AP = 0(T n+1 ), 
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i.e., in the expansion of the function fAQ — AP into the series in Chcbyshev 
polynomials, the first n + 1 terms are absent, and the latter is equivalent to rela- 
tions (61) by virtue of the fact that the Chebyshev polynomials form an orthogonal 
system. When carrying out actual computations, the coefficients q are known only 
approximately, and thus the equalities (60), (61) are also satisfied approximately. 

Consider the results of computer experiments 2 that were performed by means 
of the software implemented within the framework of the REDUCE system and 
briefly described in §13 above. We begin with the example considered in §10 above, 
where the linear Pade-Chebyshev approximant of the form (44) to the function 
cos jx was constructed on the segment [—1, 1] for m = 2, n = 3. To construct the 
corresponding nonlinear Pade-Chebyshev approximant, it is necessary to specify 
the value of the parameter N determining the number of terms in the truncated 
Taylor series (58) of the approximated function. In this case the calculation error 
is determined, in fact, by the parameter N. 

The coefficients in approximants of the form (44) which are obtained for N = 15 
and N = 20 (the nonlinear case) and the coefficients in the error approximant are 
as follows 3 : 



a = 0.4960471034987563, 
a = 0.4960471027757504, 


Aa = 


io- 8 


• 0.07230059, 


ai = -0.1451091945278387, 








ai = -0.1451091928755736, 


Aai = 


-10" 


8 -0.16522651, 


a 2 = 10~ 2 • 0.5482586543334515, 








a 2 = 10~ 2 • 0.548258121085953, 


Aa 2 = 


-10" 


9 • 0.42224856, 


a 3 = -10~ 4 • 0.5205903601778259, 
a 3 = -10~ 4 • 0.5205902238186334, 


Aa 3 = 


-10" 


10 • 0.13635919 


b Q = 0.4960471034987759, 
b a = 0.4960471027757698, 


A&o = 


io- 8 


• 0.07230061, 


h = IQ- 2 ■ 0.7884201590727615, 
h = 10~ 2 • 0.7884203019999351, 


A&i = 


-10" 


10 • 0.1429272, 


b 2 = 10~ 4 • 0.4977097973870693, 
b 2 = 10~ 4 • 0.4977100977750249, 


Ab 2 = 


-10" 


10 • 0.300388. 



Both the approximants have absolute errors A equal to 0.4 • 10 -13 and the relative 
errors 5 equal to 0.6 • 10~ 13 , these values being close to the best possible. The 
condition number of the system of equations (57) in both the cases is 0.4 • 10 8 . The 
denominator AQ of the error approximant is zero for x = xo ~ 0.70752 . . . ; the 
point xo is also close to the root of the numerator AP which for x = xo is of order 
10~ 8 . Such a situation was considered in §10 above. Outside a small neighborhood 
of the point xo the absolute and the relative errors have the same order as in the 
"linear case" considered in §10. 



2 At the author's request, computer calculations were carried out by A. Ya. Rodionov. 
3 Here we have in mind the coefficients of the expansions of the approximant's numerator and 
denominator in powers of x. 
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Now consider the nonlinear Pade-Chcbyshcv approximant of the form (44) on the 
segment [—1, 1] to the function tg jx for m = n = 3. In this case the Taylor series 
converges very slowly, and, as the parameter N increases, the values of coefficients of 
the rational approximant undergo substantial (even in the first decimal digits) and 
intricate changes. The situation is illustrated in Table 2, where the following values 
are given: the absolute errors A, the absolute errors A of error approximants 4 
(there the approximants arc compared for TV = 15 and N = 20, for N — 25 and 
N = 35, for N — 40 and N = 50), and also the values of the condition number 
cond of the system of linear algebraic equations (57). In this case the relative errors 
coincide with the absolute ones. The best possible error is A m i n = 0.83 • 10~ 17 . 

Table 2 



N 



15 



20 



25 



35 



40 



50 



cond 0.76 ■ 10 7 0.95 ■ 10 s 0.36 ■ 10 10 0.12 ■ 10 12 
A 0.13 • 10~ 4 0.81 ■ 10~ 6 0.13 • 10~ 7 0.12 ■ lO" 10 



0.11 ■ 10 12 0.11 10 12 
0.75 ■ 10- 12 0.73 ■ 10- 15 



Ao 



0.7- 10~ 4 



0.7- 10" s 



0.2 ■ 10" 9 



4 A small neighborhood of the root of the polynomial AQ is eliminated as before. 
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§15. Small deformations of approximated functions 
and acceleration of convergence of series 



Let a function f(x) be expanded into the series in Chcbyshcv polynomials, 
f( x ) — J2ilo c iTi\ consider a partial sum 

N 



of this series. Using formula (62), it is easy to verify that the linear Pade-Chebyshev 
approximant of the form (1) or (52) to the function f(x) coincides with the linear 
Pade-Chebyshev approximant to polynomial (63) for N = n + 2m, i.e., it depends 
only on the first n + 2m + 1 terms of the Fourier-Chebyshev series of the function 
f(x); a similar result is valid for the approximant of the form (44) or (45) to even 
or odd functions, respectively. Note that for N = n + 2m the polynomial /jv is 
the result of application of the algorithm of linear (or nonlinear) Pade-Chebyshev 
approximation to f(x), the exponents m and n being replaced by and 2m + n. 

The interesting effect mentioned in [9] consists in the fact that the error of the 
polynomial approximant f n +2m depending on n + 2m + 1 parameters can exceed 
the error of the corresponding Pade-Chebyshev approximant of the form (1) which 
depends only on n + m + 1 parameters. For example, consider an approximant 
of the form (45) to the function tg jx on the segment [—1,1]. For m = n = 3 
the linear Pade-Chebyshev approximant to tg \x has the error of order 10~ 17 , and 
the corresponding polynomial approximant of the form (63) has the error of order 
10~ n . This polynomial of degree 19 5 can be regarded as a result of deformation 
of the approximated function tg \ x. This deformation does not affect the first 
twenty terms in the expansion of this function in Chcbyshcv polynomials and, 
consequently, does not affect the coefficients in the corresponding rational Pade- 
Chebyshev approximant, but leads to a several orders increase of its error. Thus, 
a small deformation of the approximated function can result in a sharp change in 
the order of error of a rational approximant. 

Moreover the effect just mentioned means that the algorithm extracts from 
polynomial (63) additional information concerning the next components of the 
Fourier-Chebyshev series. In other words, in this case the transition from Fouricr- 
Chebyshev series to Pade-Chebyshev approximant accelerates convergence of sc- 
ries. A similar effect of acceleration of convergence of power series by passing to 
the classical Pade approximant is known (see [11, 14, 15]). 

It is easy to see that the nonlinear Pade-Chebyshev approximant of the form (1) 
to the function f(x) depends only on the first m + n + 1 terms of the Fourier 
Chebyshev series for f(x), so that for such approximants a more pronounced effect 
of the type indicated above takes place. 

Since one can change the "tail" of the Fourier-Chebyshev series in a quite arbi- 
trary way without affecting the rational Pade-Chebyshev approximant, the effect 
of acceleration of convergence can take place only for the series with an especially 
regular behavior (and for the corresponding "nice" functions). 

Note that the effect of error autocorrection indicates to the fact that the variation 
of an approximated function under deformations of a more general type may have 



(63) 




I— 



O 



5 Odd functions are in question, and hence m = n = 3 in (45) corresponds to rn = 6, n = 7 
in (1). 
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little effect on the rational approximant considered as a function (whereas the 
coefficients of the approximant can have substantial changes). Accordingly, while 
deforming the functions for which good rational approximation is possible, the 
approximant 's error can rapidly increase. 

There arc interesting results distinguishing the classes of functions for which an 
efficient rational approximation is possible, for example, the classes of functions 
which are approximated by rational fractions considerably better (with a higher 
rate of convergence), then by polynomials; see, in particular, [10, 50-52]. The 
reasoning given above indicate that of a special interest are "individual" properties 
of functions which guarantee their effective rational approximation. There are 
reasons to suppose that solutions of certain functional and differential equations 
possess properties of this kind. Note that in papers [16, 37], starting from the fact 
that elementary functions satisfy simple differential equations, it is shown that these 
functions are better approximated by rational fractions than by polynomial ones 
(we have in mind the best approximation); because of complicated calculations only 
the following cases were considered: the denominator of a rational approximant is 
a linear function or (for even and odd functions) is a polynomial of degree 2. 

§16. Applications to computer calculation 

Ti computer calculation of function values is reduced in fact to carrying out a 
finite set of arithmetic operations with the argument and constants, i.e. to comput- 
ing the value of a certain rational function. Now we list some typical applications 
of methods for constructing rational approximants. Often it happens that a func- 
tion f(x) is to be computed many times (for example, when solving numerically a 
differential equation) and with a given accuracy. In this case the construction of a 
rational approximant to this function (with a given accuracy) often produces the 
most economic algorithm for computation of values of f(x). For example, if f(x) 
is a complicated aggregate of elementary and special functions every one which 
can be calculated using the corresponding standard programs, then values of the 
function f(x) can, of course, be computed by means of these programs. But such 
an algorithm is often too slow and produces an unnecessary extra precision. 

Standard computer programs for elementary and special functions, in their turn, 
are based, as a rule, on rational approximants. Note that although the accuracy of 
rational and polynomial approximants to a given function is the same, the compu- 
tation of the rational approximant usually requires a lesser number of operations, 
i.e., it is more speedy; see, for example [1, 3, 12, 13, 24, 25, 31]. 

The coefficients of rational approximants to basic elementary and special func- 
tions can be found in reference handbooks; we note especially the fundamental 
book [3], see, also, for example, [12, 13]. But a computer can have certain specific 
properties requiring algorithms and approximants (for effective standard programs 
of computing functions) which are absent in reference handbooks. In that case the 
construction programs for rational approximants, including the PADE program 
described in §9 above (see also [1, 7, 9]), can be useful. 

For example, decimal computers (including calculators) are widely used at pre- 
sent. The reason is that the use of decimal arithmetic system (instead of the 
standard binary one) enables the user to avoid a considerable loss of computing 
time needed for the transformation of numbers from the decimal representation 
to the binary one and vice versa. This is especially important if the amount of 
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the input/output operations is relatively large; the latter situation is characteristic 
for calculations in the interactive mode. A method of computing elementary func- 
tions on decimal computers which uses the technique of rational approximants is 
described in the Appendix below. The main idea of this method consists in the 
fact that the computation of values of various elementary functions, by means of 
simple algorithms, is reduced to the computation of a rational function of a fixed 
form. Roughly speaking, all basic elementary functions are calculated according 
to the same formula. Only the coefficients of the rational expression depend on a 
calculated function. 

§17. Nonlinear models and rational approximants 

One of the main problems of mathematical modeling is to construct analytic 
formulas (models) that approximately describe the functional dependence between 
different quantities according to given "experimental" data concerning the values 
of these quantities. In particular, let the set of real numbers x\,...,x v which 
are values of the "independent" variable x be given, and for every value Xi of 
this variable the value of the "dependent" variable y be given. The problem 
is to construct a function y = F(x) such that the functional dependence can be 
represented by an analytic formula of a certain form, and the approximate equality 
yi w F(xi) be valid for all i = 1,2, ...,v, where the function F(x) should take 
"reasonable values" at points x lying between the given points Xi. In practice the 
values yi are usually given with errors. 

As it was noted above, computer calculation of functions is finally reduced to 
computation of some rational functions. Thus in many cases it is natural to con- 
struct an analytic model in the form of the rational function (1), where the degrees 
of the numerator and the denominator and also the values of the coefficients are 
determined in the process of modeling, see [14]. Of course, in this case we have in 
mind only the one-factor models. One can construct multi- factor models by using 
rational functions of several variables. 

If we have a simple program of constructing rational approximants to continuous 
functions defined on finite segments of the real line, then we can reduce the con- 
struction of a model to constructing rational approximants to a continuous function 
(although in numerical analysis, as a rule, the goal is to reduce continuous problems 
to discrete ones). The construction of a model is carried out in two steps. On the 
first step a continuous function f(x) such that f(xi) = yi is constructed. A linear 
or a cubic spline (depending on the user's choice) is used as f(x). The function 
whose graph coincides with the polygonal line consisting of segments of straight 
lines that connect the points (xi,yi) with the coordinates Xi, yi is the linear spline; 
the cubic spline is described, for example, in [41]. On the second step the model is 
constructed by means of the PADE program. This approach guarantees the regular 
behavior of the model on the entire range of the argument. 

If there are reasons to assume that the initial data lie on a sufficiently smooth 
and regular curve, then it is expedient to use a cubic spline. And if there are 
reasons to assume that the initial data contain considerable errors or deviations 
from theoretically admissible data, then it is expedient to use a linear spline: the 
behavior of a cubic spline at intermediate points in this case will be irregular. 

The method for constructing models described above was implemented (together 
with I. A. Andreeva) as the SPLINE-PADE program. This program prints out the 
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graphs of splines and rational approximants (together with the initial data), and 
this facilitates the analysis of models. Of course, while choosing and analyzing 
models, it is necessary to take into account the theoretical requirements on the 
model which are connected with specific features of a particular problem. 

Example. Let the points Xi,...,x v be uniformly distributed on the segment 
[— f,f], %i = —j, v = 32, x v = j, yi = cosxi. The rational approximant of 
the form (45) to the linear spline for in = n = 2 gives an approximant to cos x on 
[— f , f ] with the absolute error A = 1CP 3 . If a cubic spline is applied, then the 
absolute error A is 0.35 • 10~ 6 in this case. 

Other approaches to the construction of models in the form of rational functions 
can be found, for example, in [14]. 

The above results connected with the effect of error autocorrection show that 
similar models can have quite different coefficients. Thus the coefficients of models 
of this kind are, generally speaking, unstable; and one should be very careful when 
trying to give a substantial interpretation for these coefficients. 

APPENDIX 

A METHOD OF IMPLEMENTATION OF BASIC 
CALCULATIONS ON DECIMAL COMPUTERS 

1. Introduction. A large relative amount of input/output operations is a char- 
acteristic feature of modern interactive computer systems. This results in a waste 
of computing time of systems with binary number representation: numbers are 
transformed from the decimal representation to the binary one and vice versa. 
Therefore, certain computers use decimal arithmetic system. As a rule, the use of 
decimal arithmetic system leads to a decrease in the rate of calculations and to ad- 
ditional memory requirements connected with specific coding of decimal numbers. 
The decrease in the rate of calculations is due to the fact that the implementation of 
decimal operations, as compared to that of binary ones, is more complicated; more- 
over, the binary representation is more convenient for implementing algorithms for 
calculating certain functions then the decimal one. Since the performance rate of 
floating point arithmetic operations and the rate of calculating elementary functions 
determine, to a considerable extent, the rate of mathematical data performing, the 
quality of the corresponding algorithms is, especially for cheap personal systems, 
of great importance. 

Here we consider methods of implementation of the floating point arithmetic 
system and of organizing computations for elementary functions. These methods 
are convenient to use on decimal computers (this pertains both to the software 
and hardware implementation). They guarantee a sufficient economy of memory 
simultaneously with a relatively high performance rate of calculations. Examples 
of effective software implementation of these methods are given in [1, 53]. The 
hardware implementation is described in the patent [55]. The methods under con- 
sideration are also of interest for octal and hexadecimal computers. 

2. Floating point arithmetic system. When carrying out arithmetic opera- 
tions with floating point numbers, the exponents of these numbers undergo only 
the operations of addition, subtraction, and comparison. Almost all computers 
have means for these operations since they are necessary for the command and 
the address codes operations. This fact provides an opportunity to use the binary 
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representation for the exponents when implementing the floating point arithmetic 
system. Since exponents are integers lying in certain bounds, the transformation 
of exponents from binary to decimal representations does not encounter serious 
obstacles. The choice of an appropriate algorithm depends on the structure of a 
computer and the method of coding of decimal numbers. For the standard coding 
8421, when each decimal digit corresponds to a binary tetrad, it is possible to use 
the fact that in this case the numbers from to 9 have the same coding in the 
binary and the decimal representations. Therefore the binary representation X2 of 
a number x can be converted into the decimal representation x\o by successively 
subtracting (in the binary arithmetic system) the numbers from to 9 from X2 and 
forming the number xio from the sums of these numbers (in the decimal arithmetic 
system). Similarly, a decimal integer can be converted into a binary one. 

Binary representation of exponents enables one to save memory, and the com- 
bination of decimal operations with more rapid binary operations of addition type 
enhances the performance rate. As a rule, the software implementation of the float- 
ing point arithmetic system leads to the fact that floating point operations take two 
orders as much time when compared with fixed point operations. The implemen- 
tation described in [1] is much more efficient: for seven decimal digit numbers, the 
transition from the fixed point to the floating point regimes results in double com- 
puting time for multiplication and division, and to reduction of the rate of addition 
and subtraction by one decimal order. 

3. The design of computation for elementary functions. The calculation of 
values of each of the basic elementary functions (at the reduction stage) is reduced 
to calculation of values of an odd function on a symmetric (with respect to the 
origin) interval. This odd function is approximated by a rational fraction of the 
form 

a + py z + y 4 

where y is the reduced argument, and the coefficients a, 6, c, a, (3 depend on the 
approximated function. Thus all algorithms of computation for basic elementary 
functions have the common block (1), and this fact guarantees an economy of 
memory. This block can be implemented both as a carefully devised part of software 
or as a part of hardware; this can enhance the performance rate. For the reduction 
algorithms described below, the approximant of the form (1) can guarantee 8-9 
accurate decimal digits. Because of specific features of a particular computer and 
the way the common block is implemented, it can be required that expression (1) 
be transformed into a certain form, for example, into the form 

r-t i\ ul \ a + y 2 (b + cy 2 ) 

a + y 2 ((3 + y 2 ) 

or into a Jacobi fraction of the form 
(1") R(y)=y(c+-^^ I 



y' + v + 



The calculation of elementary functions with enhanced precision is organized 
according to a similar scheme. The approximant of the form (1) is replaced by the 
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expression 



which can be transformed into the form similar to (1') or (1"), i.e., 
d") R(y) = y(d + 



J?l„\-nl,U i ) 

The coefficients a, b, c, d, a, (i, 7, £, 77, n, v, ds, A in formulas (1'), (1"), (2), (2'), 
(2") are constants that depend on the approximated function. The approximants 
of the form (2), (2') or (2") guarantee 12-13 accurate decimal digits 6 . 

The reduction algorithms are uniform; in particular, for calculations with or- 
dinary and enhanced precision the same reduction algorithms are used. These 
algorithms are described in section 4 below. The errors of approximants and values 
of the coefficients in expressions (1), (2) and in their modifications are given below. 
These coefficients are cither taken from [3], or calculated by means of the PADE 
program described in §9 above. 

4. Algorithms. The relative, mean relative, absolute, and mean absolute errors 
are denoted by S, 5, A, A, respectively. 

4.1. Calculation of logarithms. Let the argument x > have the form x = 
xo ■ 10 p , where 0.1 < x < 1, p is an integer. Suppose 

then we have 

1 1+77 

so = 10-3 . 

1-2/ 

whence 

lgx=p - -+lg- • 

2 1-y 

Substituting the approximant of the form R{y) with the best possible absolute error 
for the odd function 

1 + y _ 2 Arcth y 
g l-y ~ In 10 ' 
we finally obtain lg(a;) w p — 1/2 + R(y) for 

1 - Vw Vw- 1 
1 + VTo ^ v V10 + 1' 



6 Of course, the values of the argument for which the loss of precision is inevitable are an 
exception. For example, if x = 1 + Ax, then In a; « Ax, and the number of significant digits of In a; 
is smaller than the number of significant digits of the argument x by the number of zeros after 
the decimal point in the number Ax. 

37 



For 0.1 ^ x < 1 and ordinary precision, A = 0.23 • 10~ 8 , A = 0.14 • 10~ 8 . For 
enhanced precision, A = 0.85 • 10~ 12 , A = 0.53 • 10~ 12 . It is impossible to minimize 
the relative error on the given interval since this error is inevitable in a neighborhood 
of the point x = 1 . 

The calculation of the natural logarithm is reduced to the case of the decimal 
logarithm by means of the relation In a; = (In 10)(lgx). 

4.2. Calculation of exponentials. Consider a nonstandard (at the first sight) 
algorithm of reduction of the function lO 2 , which, nevertheless, is dual to the algo- 
rithm of reduction of \gx described above. Represent the argument x in the form 
x = y+p, where — 1 < y < 1, p is an integer (for example, —12.85 = — 12 + (— 0.85)). 
Then 

io- « i 0P i±#4, 

where R(y) is the approximant of the form (1) or (2) on the interval [—1,1] to the 
odd function th(xln 10/2). For — 1 ^ x ^ 1 and ordinary precision, 8 = 0.23- 10~ 8 , 
8 = 0.6 • 10~_ 9 , A = 0.23 • 10~ 7 , A = 0.17 • KT 8 . For enhanced precision, 8 = 
0.17 • 10~ 13 , 8 = 0.3 • 10~ 14 , A = 0.16 • 10~ 12 , A = 0.85 • 10~ 14 . 

For calculation of the functions e x and x v , the relations e x = 10 xl & e and x y = 
IQyigx are usec l. 

4.3. Calculation of sinx and cosx. Denote by R(x) the approximant of the 
form (1) or (2) with the best possible relative error to the function sinx on the 
segment [— x/2, tt/2]. Since the function sinx is odd, it is sufficient to consider the 
case x > 0. Denote by {a} the fractional part of a positive number a; for example, 
{12.08} = 0.08. Set y = {x/2n} ■ 2tt, then < y < 2ir. If y ^ tt/2, then we 
set sinx ~ R(y); if tt/2 < y ^ 37r/2, then sinx w R(z), where z = ir — y, and if 
37r/2 < y < 2tt, then sinx R(z), where z — y — 2n. For — tt/2 ^ x < tt/2 and 
ordinary precision, 8 = A = 0.53 • 10~ 8 , 5_ = 0.34 • 10" 8 , A = 0.21 • 10" 8 . For 
enhanced precision, 8 = A = 0.56 • 10~ 13 , 8 = 0.32 • 10" 13 , A = 0.2 • 10~ 13 . The 
calculation of cosx is reduced to the calculation of sinx by means of the relation 
cosx = sin(7r/2 — x). 

4.4. Calculation o/tgx. Let R(x) be the approximant of the form (1) or (2) 
with the best possible relative error to the function tgx on the segment [— 7r/4, 7r/4], 
the approximant (2) satisfying the additional condition d — 0. The algorithm of 
reduction is quite similar to the algorithm for sinx given above. For x > set 
y = {x/tt}tt; in this case < y < tt. Hence tgx R(y) for y ^ 7r/4; for 7r/4 < y < 
37r/4 we have tgx w 1/R(z), where z — n/2 — y; finally, for 37r/4 < y < n we have 
tgx « R(z), where z = y — tt. For x < we use the relation tg(— x) = — tgx. For 
-tt/4 < x < tt/4 and ordinary precision A = 8 = 0.22 • IO" 10 , A = 0.63 • 10" 11 , 
6_= 0.13 • IO" 10 . For enhanced precision, A = 8 = 0.26 • 10~ 13 , 8 = 0.15 • 10~ 13 , 
A = 0.67 • 10~ 14 . The algorithm for calculating tgx described above has essential 
advantages in accuracy and speed as compared with the algorithm using the relation 
tgx = sinx/ cosx and the algorithms for calculating sinx and cosx. 

4.5. Calculation o/arctgx. Let R(x) be the approximant of the form (1) or 
(2) with the best possible relative error to the function arctgx for |x| tg7r/8 = 
\[2 — 1. The reduction is standard: if ^ x < \f2 — 1, then arctgx w R(x); 
if \pl — 1 < x < 1, then arctgx w 7r/4 — R(y), where y = (1 — x)/(l + x); if 
x > 1, then arctgx w tt/2 — R(l/x); for x < the relation arctg(— x) = — arctgx 
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is used. For |x| s; \/2 — 1 and ordinary precision, 5 = 0.29 -10 9 , S = 0.18 • 10 9 , 
A = 0.11 • 10~ 9 ,_ A = 0.37 • 10- w . For enhanced precision, A = 0.11 • 10~ 13 , 
6 = 0.29 • 10~ 13 , 5 = 0.18 • 10~ 13 , A = 0.36 • 10~ 14 . 

4.6. Calculation o/arcsinx. Let R(x) be the approximant of the form (1) or 
(2) with the best possible relative error to the function arcsinx on the interval 
[—1/2, 1/2]. Since the function is odd, it is sufficient to consider the case x > 0. If 
< x ^ 1/2, then arcsinx w R(x); if 1/2 < x < 1, then arcsinx w 7r/2 — 2R(y), 
where y = a/(1 — x) /2. For —1/2 x ^ 1/2 and ordinary precision, 6 = 0.25 T0~ 8 , 
6 = 0.16 • 10_- 8 , A = 0.13 • lO- 8 , A = 0.41 • KT 9 ; for enhanced precision, 8 = 
0.82 • 10~ 12 , S = 0.52 • 10~ 12 , A = 0.43 • 10~ 12 , A = 0.13 • 10~ 12 . 

4.7. The reduction algorithms for lgx, arcsinx, and arctgx described above are 
taken from [3]. The reduction algorithms for sinx and tgx were proposed by the 
author and R. M. Borisyuk [53]. Of course, in particular cases the general scheme is 
supplemented by special ruses. For example, sinx, arcsinx, and arctgx are replaced 
by x for small values of the argument, and so on. 

5. Coefficients. For every function the coefficients of approximants that are 
used while computing values of this function are indicated below (see Table 3 and 
Table 4). For every function the coefficients of approximants (1), (1') and (1") are 
listed according to the following order: a, b, c, a, (5, 7, d, £, 77, ji, A, v, ce; the 
coefficients of approximants (2), (2') and (2") are listed according to the following 
order: a, b, c, d, a, (3, 7, £, ij, /j,, A, v, as; mantissas (significands) are separated 
from exponents by the letter D. The accuracy of the coefficients (16 decimal digits 
of the mantissa) is, of course, excessive. 

6. Analysis of the algorithms. It is easy to see that the algorithms of cal- 
culating trigonometric and inverse trigonometrical functions do not depend on on 
the arithmetic system of the computer. On the contrary, while implementing the 
computing algorithms for exponentials, logarithms and functions that are expressed 
through them (hyperbolic and inverse hyperbolic functions 7 , x v ) the binary arith- 
metic system has an essential advantage over the decimal one. For example, for 
binary arithmetic system the computation of the logarithm is reduced to finding an 
approximant on the segment [1/2, 1] (and not on the segment [1/10, 1]); since 1/2 
is much closer to zero than 1/10, this implies that the approximation rate increases 
considerably. While computing In x according to the scheme described above on a 
binary computer, the approximant of the form (1) which depending on five param- 
eters can be replaced by a more exact approximant (on a smaller segment) which 
depending only on three parameters. A similar situation arises while calculating an 
exponential. But the use of the decimal arithmetic system leads to a certain equi- 
librium between the difficulty of computing logarithmic and exponential functions, 
on one hand, and trigonometric functions, on the other. Thus in this case the use 
of a separate common block of the form (1) or (2) is justified. 

7. Implementation of algorithms for calculating elementary functions. 

For the software implementation it is expedient to use representations (1") and (2") 
for rational approximants in the form of Jacobi fractions; this allows to minimize 
the number of arithmetic operations. The rate of computation of functions can be 



7 Note that it is also convenient to use the common block of type (1) or (2) while calculating 
hyperbolic and inverse hyperbolic functions. 
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increased by implementing the calculation of Jacobi fractions mentioned above by 
means of the fixed point arithmetic system as described in [1] . 

A method of hardware implementation for algorithms under consideration is de- 
scribed in the patent [55]. In this case it is expedient to use representations (1') 
and (2') for rational approximants and to carry out computations of the fraction 
numerator and denominator in parallel. For example, when computing expres- 
sion (2'), the value y 2 being computed beforehand, it is possible to use the summa- 
tor to compute 7 + y 2 and the multiplier to compute d ■ y 2 simultaneously. Then 
y 2 is multiplied by (7 + y 2 ) and simultaneously the quantity c is added to d ■ y 2 , 
and so on. Under such an implementation, additional hardware requirements are 
minimal since almost all computers have a summator and a multiplier. 
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Tabic 3. 



Ordinary precision 



Igx 

0.3187822082024000D 01 

-0.2655808794660000D 01 

0.2668632700470000D 00 

0.3670115625115000D 01 

-0.4280973292830000D 01 

* * * 

0.2668632700470000D 00 

-0.1513374262751513D 01 

-0.1459257686092834D 01 

-0.2821715606737166D 01 

-0.4474945619843138D 00 
sin a; 

0.2051458702138878D 04 

-0.2731535822018325D 03 

0.6635500992122553D 01 

0.2051458712958973D 04 

0.6875599504020228D 02 

* * * 

0.6635500992122553D 01 

-0.7293840555054680D 03 

0.1585035693573942D 02 

0.5290563810446287D 02 

0.1212885465090180D 04 

arctg x 

0.4482500977985320D 01 

0.3372473379182700D 01 

0.2742666270116000D 00 

0.4482500979270910D 01 

0.4866639968788300D 01 

* * * 

0.2742666270116000D 00 

0.2037716450063295D 01 

0.1596444173439070D 01 

0.3270195795349230D 01 

-0.7381840442193144D 00 
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Table 4. Enhanced precision 



Igx 

-0.8625170319686105D 01 

0.1170031513942458D 02 

-0.3932918863942010D 01 

0.1960631750250080D 00 

-0.9930094301066197D 01 

0.1678051701465279D 02 

-0.8135425915212830D 01 

* * * 

0.1960631750250080D 00 

-0.2337861428824651D 01 

-0.4538004104289327D 01 

-0.2401159236894577D 01 

-0.1263136819683083D 01 

-0.2334284991240421D 01 

-0.9196001135281050D -01 
sin a; 

0.5896178692831105D 06 

-0.8390788060005352D 05 

0.2686819889613831D 04 

-0.2413382332334080D 02 

0.5896178692830811D 06 

0.1436176428157609D 05 

0.1669650188299142D 03 

* * * 

-0.2413382332334080D 02 

0.6716324155233251D 04 

0.1278518975583208D 03 

0.7154609949974411D 04 

0.4298163104533473D 02 

-0.3868509773741323D 01 

0.2372742417389960D 04 
arctg x 

0.1360093213361806D 02 

0.1696003978718046D 02 

0.5018932379116295D 01 

0.2013116125542811D 00 

0.1360093213361844D 02 

0.2149368383148177D 02 

0.9463307253423236D 01 

* * * 

0.2013116125542811D 00 

0.3113858735833039D 01 

0.5406247281512437D 01 

-0.3928353166591151D 01 

0.1338761180200926D 01 

0.2718298791709874D 01 

-0.1505853445310237D 00 



10 x 

0.2416631060448244D 04 

0.4101315802439533D 03 

0.1179791485292238D 02 

0.4067164397089984D -01 

0.2099059068697363D 04 

0.1283652206816926D 04 

0.8569022438348670D 02 

* * * 

0.4067164397089984D -01 

0.8312752555010688D 01 

0.4263308628906700D 02 

- 0.8324501520959055D 03 
0.1184109379926876D 02 
0.3121604429515093D 02 

- 0.8918843336784789D 02 

tgx 

- 0.1025905306253198D 05 
0.1244879443625433D 04 

- 0.2085510846487616D 02 

0.0 

- 0.1025905306253223D 05 
0.4664563797829625D 04 

- 0.2078359665454338D 03 

* * * 

0.0 

- 0.2085510846487616D 02 

- 0.1481441435217348D 03 

- 0.4670350587972472D 04 

- 0.1340714587130577D 02 

- 0.4628467715239315D 02 

- 0.1286250294831805D 03 

arcsin x 

- 0.1650992722517542D 02 
0.2211306741883449D 02 

- 0.7427989795595475D 01 
0.4054985920387412D 00 

- 0.1650992722516183D 02 
0.2486472195166319D 02 

- 0.1033386531178065D 02 

* * * 

0.4054985920387412D 00 

- 0.3237621961350434D 01 

- 0.6618033809642488D 01 

- 0.2758376635378124D 01 

- 0.1288186871451920D 01 

- 0.2427644630686242D 01 

- 0.9565986684443910D -01 
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